METHODS OF DIAGNOSIS OF ANGIOGENESIS, COMPOSITIONS 
AND METHODS OF SCREENING FOR ANGIOGENESIS 
MODULATORS 

CROSS-REFERENCES TO RELATED APPLICATIONS 
This application claims priority to USSN 09/784,356, filed February 14, 2001; 
USSN 09/791,390, filed February 22, 2001; USSN 60/285,475, filed April 19, 2001, USSN 
60/310,025, filed August 3, 2001, and USSN 60/334,244, filed November 29, 2001, each of 
which is herein incorporated by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein 
expression profiles and nucleic acids, products, and antibodies thereto that are involved in 
angiogenesis; and to the use of such expression profiles and compositions in diagnosis and 
therapy of angiogenesis. The invention further relates to methods for identifying and using 
agents and/or targets that modulate angiogenesis. 

BACKGROUND OF THE INVENTION 
Both vasculogenesis, the development of an interactive vascular system 
comprising arteries and veins, and angiogenesis, the generation of new blood vessels, play a 
role in embryonic development. In contrast, angiogenesis is limited in a normal adult to the 
placenta, ovary, endometrium and sites of wound healing. However, angiogenesis, or its 
absence, plays an important role in the maintenance of a variety of pathological states. Some 
of these states are characterized by neovascularization, e.g., cancer, diabetic retinopathy, 
glaucoma, and age related macular degeneration. Others, e.g., stroke, infertility, heart 
disease, ulcers, and scleroderma, are diseases of angiogenic insufficiency. 

Angiogenesis has a number of stages (see, e.g., Folkman, J. Natl Cancer Inst. 
82:4-6, 1990; Firestein, J Clin Invest. 103:3-4, 1999; Koch, Arthritis Rheum. 41:951-62, 1998; 
Carter, Oncologist 5(Suppl 1):51-4, 2000; Browder et al., Cancer Res. 60:1878-86, 2000; and 
Zhu and Witte, Invest New Drugs 17:195-212, 1999). The early stages of angiogenesis 
include endothelial cell protease production, migration of cells, and proliferation. The early 
stages also appear to require some growth factors, with VEGF, TGF-α, angiostatin, and 
selected chemokines all putatively playing a role. Later stages of angiogenesis include 
population of the vessels with mural cells (pericytes or smooth muscle cells), basement 
membrane production, and the induction of vessel bed specializations. The final stages of 
vessel formation include what is known as "remodeling", wherein a forming vasculature 
becomes a stable, mature vessel bed. Thus, the process is highly dynamic, often requiring 
coordinated spatial and temporal waves of gene expression. 

Conversely, the complex process may be subject to disruption by interfering 
with one or more critical steps. Thus, the lack of understanding of the dynamics of 
angiogenesis prevents therapeutic intervention in serious diseases such as those indicated. It 
is an object of the invention to provide methods that can be used to screen compounds for the 
ability to modulate angiogenesis. Additionally, it is an object to provide molecular targets for 
therapeutic intervention in disease states which have either an undesirable excess or a deficit 
of angiogenesis. The present invention provides solutions to both. 



SUMMARY OF THE INVENTION 
The present invention provides compositions and methods for detecting or 
modulating angiogenesis associated sequences. 
In one aspect, the invention provides a method of detecting an angiogenesis-associated 
transcript in a cell in a patient, the method comprising contacting a biological 
sample from the patient with a polynucleotide that selectively hybridizes to a sequence at 
least 80% identical to a sequence as shown in Tables 1-8. In one embodiment, the biological 
sample is a tissue sample. In another embodiment, the biological sample comprises isolated 
nucleic acids, which are often mRNA. 

In another embodiment, the method further comprises the step of amplifying 
nucleic acids before the step of contacting the biological sample with the polynucleotide. 
Often, the polynucleotide comprises a sequence as shown in Tables 1-8. The polynucleotide 
can be labeled, for example, with a fluorescent label and can be immobilized on a solid 
surface. 

In other embodiments the patient is undergoing a therapeutic regimen to treat a 
disease associated with angiogenesis or the patient is suspected of having an angiogenesis- 
associated disorder. 
In another aspect, the invention comprises an isolated nucleic acid molecule 
consisting of a polynucleotide sequence as shown in Tables 1-8. The nucleic acid molecule 
can be labeled, for example, with a fluorescent label. 

In other aspects, the invention provides an expression vector comprising an 
isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Tables 1-8 
or a host cell comprising the expression vector. 

In another embodiment, the isolated nucleic acid molecule encodes a 
polypeptide having an amino acid sequence as shown in Table 8. 

In another aspect, the invention provides an isolated polypeptide which is 
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-8. 
In one embodiment, the isolated polypeptide has an amino acid sequence as shown in Table 
8. 

In another embodiment, the invention provides an antibody that specifically 
binds a polypeptide that has an amino acid sequence as shown in Table 8 or which is encoded 
by a nucleotide sequence of Tables 1-8. The antibody can be conjugated or fused to an 
effector component such as a fluorescent label, a toxin, or a radioisotope. In some 
embodiments, the antibody is an antibody fragment or a humanized antibody. 

In another aspect, the invention provides a method of detecting a cell 
undergoing angiogenesis in a biological sample from a patient, the method comprising 
contacting the biological sample with an antibody that specifically binds to a polypeptide that 
has an amino acid sequence as shown in Table 8 or which is encoded by a nucleotide 
sequence of Tables 1-8. In some embodiments, the antibody is further conjugated or fused to 
an effector component, for example, a fluorescent label. 

In another embodiment, the invention provides a method of detecting 
antibodies specific to angiogenesis in a patient, the method comprising contacting a 
biological sample from the patient with a polypeptide which is encoded by a nucleotide 
sequence of Tables 1-8. 

The invention also provides a method of identifying a compound that 
modulates the activity of an angiogenesis-associated polypeptide, the method comprising the 
steps of: (i) contacting the compound with a polypeptide that comprises at least 80% identity 
to an amino acid sequence as shown in Table 8 or which is encoded by a nucleotide sequence 
of Tables 1-8; and (ii) detecting an increase or a decrease in the activity of the polypeptide. 
In one embodiment, the polypeptide has an amino acid sequence as shown in Table 8 or is a 
polypeptide encoded by a nucleotide sequence of Tables 1-8. In another embodiment, the 
polypeptide is expressed in a cell. 

The invention also provides a method of identifying a compound that 
modulates angiogenesis, the method comprising steps of: (i) contacting the compound with a 
cell undergoing angiogenesis; and (ii) detecting an increase or a decrease in the expression of 
a polypeptide sequence as shown in Table 8 or a polypeptide which is encoded by a 
nucleotide sequence of Tables 1-8. In one embodiment, the detecting step comprises 
hybridizing a nucleic acid sample from the cell with a polynucleotide that selectively 
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-8. In 
another embodiment, the method further comprises detecting an increase or decrease in the 
expression of a second sequence as shown in Table 8 or a polypeptide which is encoded by a 
nucleotide sequence of Tables 1-8. 

In another embodiment, the invention provides a method of inhibiting 
angiogenesis in a cell that expresses a polypeptide at least 80% identical to a sequence as 
shown in Table 8 or which is at least 80% identical to a polypeptide encoded by a nucleotide 
sequence of Tables 1-8, the method comprising the step of contacting the cell with a 
therapeutically effective amount of an inhibitor of the polypeptide. In one embodiment, the 
polypeptide has an amino acid sequence shown in Table 8 or is a polypeptide which is 
encoded by a nucleotide sequence of Tables 1-8. In another embodiment, the inhibitor is an 
antibody. 

In other embodiments, the invention provides a method of activating 
angiogenesis in a cell that expresses a polypeptide at least 80% identical to a sequence as 
shown in Table 8 or at least 80% identical to a polypeptide which is encoded by a nucleotide 
sequence of Tables 1-8, the method comprising the step of contacting the cell with a 
therapeutically effective amount of an activator of the polypeptide. In one embodiment, the 
polypeptide has an amino acid sequence shown in Table 8 or is a polypeptide which is 
encoded by a nucleotide sequence of Tables 1-8. 

Other aspects of the invention will become apparent to the skilled artisan by 
the following description of the invention. 

Tables 1-8 provide nucleotide sequences of genes that exhibit changes in 
expression levels as a function of time in tissue undergoing angiogenesis compared to tissue 
that is not. 
DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
In accordance with the objects outlined above, the present invention provides 
novel methods for diagnosis and treatment of disorders associated with angiogenesis 
(sometimes referred to herein as angiogenesis disorders or AD), as well as methods for 
screening for compositions which modulate angiogenesis. By "disorder associated with 
angiogenesis" or "disease associated with angiogenesis" herein is meant a disease state which 
is marked by either an excess or a deficit of blood vessel development. Angiogenesis 
disorders associated with increased angiogenesis include, but are not limited to, cancer and 
proliferative diabetic retinopathy. Pathological states for which it may be desirable to 
increase angiogenesis include stroke, heart disease, infertility, ulcers, wound healing, 
ischemia, and scleroderma. Solid tumors typically require angiogenesis to support or sustain 
growth, e.g., breast, colon, lung, brain, bladder, and prostate tumors. Other AD include, e.g., 
arthritis, inflammatory bowel disease, diabetic retinopathy, macular degeneration, 
atherosclerosis, and psoriasis. Also provided are methods for treating AD. 

Definitions 

The term "angiogenesis protein" or "angiogenesis polynucleotide" refers to 
nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies 
homologs that: (1) have an amino acid sequence that has greater than about 60% amino acid 
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably 
over a region of at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an 
angiogenesis protein sequence of Table 8; (2) bind to antibodies, e.g., polyclonal antibodies, 
raised against an immunogen comprising an amino acid sequence of Table 8, and 
conservatively modified variants thereof; (3) specifically hybridize under stringent 
hybridization conditions to an anti-sense strand corresponding to a nucleic acid sequence of 
Tables 1-8 and conservatively modified variants thereof; (4) have a nucleic acid sequence 
that has greater than about 95%, preferably greater than about 96%, 97%, 98%, 99%, or 
higher nucleotide sequence identity, preferably over a region of at least about 25, 50, 100, 
200, 500, 1000, or more nucleotides, to a sense sequence corresponding to one set out in 
Tables 1-8. A polynucleotide or polypeptide sequence is typically from a mammal 
including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, 
horse, sheep, or any mammal. An "angiogenesis polypeptide" and an "angiogenesis 
polynucleotide" include both naturally occurring and recombinant forms. 
A "full length" angiogenesis protein or nucleic acid refers to an angiogenesis 
polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the elements 
normally contained in one or more naturally occurring, wild type angiogenesis polynucleotide 
or polypeptide sequences. The "full length" may be prior to, or after, various stages of 
post-translation processing. 

"Biological sample" as used herein is a sample of biological tissue or fluid that 
contains nucleic acids or polypeptides, e.g., of an angiogenic protein. Such samples include, 
but are not limited to, tissue isolated from primates, e.g., humans, or rodents, e.g., mice and 
rats. Biological samples may also include sections of tissues such as biopsy and autopsy 
samples, and frozen sections taken for histologic purposes. A biological sample is typically 
obtained from a eukaryotic organism, most preferably a mammal such as a primate e.g., 
chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or a bird; 
reptile; or fish. 

"Providing a biological sample" means to obtain a biological sample for use in 
methods described in this invention. Most often, this will be done by removing a sample of 
cells from an animal, but can also be accomplished by using previously isolated cells (e.g., 
isolated by another person, at another time, and/or for another purpose), or by performing the 
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will 
be particularly useful. 

The terms "identical" or percent "identity," in the context of two or more 
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that 
are the same or have a specified percentage of amino acid residues or nucleotides that are the 
same (i.e., about 70% identity, preferably 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 
96%, 97%, 98%, 99%, or higher identity over a specified region (e.g., SEQ ID NOS:1-229), 
when compared and aligned for maximum correspondence over a comparison window or 
designated region) as measured using a BLAST or BLAST 2.0 sequence comparison 
algorithm with default parameters described below, or by manual alignment and visual 
inspection (see, e.g., NCBI web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such 
sequences are then said to be "substantially identical." This definition also refers to, or may 
be applied to, the complement of a test sequence. The definition also includes sequences that 
have deletions and/or additions, as well as those that have substitutions. As described below, 
the preferred algorithms can account for gaps and the like. Preferably, identity exists over a 
region that is at least about 25 amino acids or nucleotides in length, or more preferably over a 
region that is 50-100 amino acids or nucleotides in length. 
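
As an illustration of the percent identity measure defined above, the following minimal sketch counts identical positions in two sequences that have already been aligned. The function name, the gap symbol, and the choice of alignment length as the denominator are assumptions made for this example; BLAST and other programs may report identity over a different denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both rows carry the same residue.

    Assumes the two strings are rows of an existing pairwise alignment,
    so they have equal length and use `gap` for inserted positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Skip columns where both rows are gaps; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no comparable positions")
    # A column matches only when both rows hold the same (non-gap) residue.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Ten aligned columns, eight of which are identical.
print(percent_identity("ACGT-ACGTA", "ACGTTACGAA"))  # 80.0
```

Under this convention a gap opposite a residue counts as a non-identical position, which is consistent with the statement above that the definition includes sequences having deletions and/or additions.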
For sequence comparison, typically one sequence acts as a reference sequence, 
to which test sequences are compared. When using a sequence comparison algorithm, test 
and reference sequences are entered into a computer, subsequence coordinates are designated, 
if necessary, and sequence algorithm program parameters are designated. Preferably, default 
program parameters can be used, or alternative parameters can be designated. The sequence 
comparison algorithm then calculates the percent sequence identities for the test sequences 
relative to the reference sequence, based on the program parameters. 

A "comparison window", as used herein, includes reference to a segment of 
any one of the number of contiguous positions selected from the group consisting of from 20 
to 600, usually about 50 to about 200, more usually about 100 to about 150, in which a 
sequence may be compared to a reference sequence of the same number of contiguous 
positions after the two sequences are optimally aligned. Methods of alignment of sequences 
for comparison are well-known in the art. Optimal alignment of sequences for comparison 
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. 
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. 
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. 
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms 
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, 
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and 
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 
1995 supplement)). 

Preferred examples of algorithms that are suitable for determining percent 
sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which 
are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and Altschul et al., 
Nuc. Acids Res. 25:3389-3402 (1997), respectively. BLAST and BLAST 2.0 are used, with the 
parameters described herein, to determine percent sequence identity for the nucleic acids and 
proteins of the invention. Software for performing BLAST analyses is publicly available 
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). 
This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying 
short words of length W in the query sequence, which either match or satisfy some 
positive-valued threshold score T when aligned with a word of the same length in a database 
sequence. T is referred to as the neighborhood word score threshold (Altschul et al, supra). 
These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs 
containing them. The word hits are extended in both directions along each sequence for as 
far as the cumulative alignment score can be increased. Cumulative scores are calculated 
using, for nucleotide sequences, the parameters M (reward score for a pair of matching 
residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino 
acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the 
word hits in each direction is halted when: the cumulative alignment score falls off by the 
quantity X from its maximum achieved value; the cumulative score goes to zero or below, 
due to the accumulation of one or more negative-scoring residue alignments; or the end of 
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the 
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) 
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a 
comparison of both strands. For amino acid sequences, the BLASTP program uses as 
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix 
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of 
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. 
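
The word hit extension step described above can be sketched as follows for nucleotide sequences. This is a simplified, ungapped illustration and not the BLAST implementation itself: the function names and the value chosen for X are assumptions, while M=5 and N=-4 follow the BLASTN defaults stated above.

```python
def _extend(query, subject, q, s, step, score, x_drop, match, mismatch):
    """Extend in one direction; return (best score, residues kept)."""
    best, best_len, length = score, 0, 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        score += match if query[q] == subject[s] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        # Halt when the score falls off by X from its maximum,
        # or when the cumulative score goes to zero or below.
        if best - score >= x_drop or score <= 0:
            break
        q += step
        s += step
    # Falling out of the loop normally means a sequence end was reached.
    return best, best_len


def extend_word_hit(query, subject, q_start, s_start, word_len,
                    match=5, mismatch=-4, x_drop=20):
    """Extend a word hit in both directions without gaps.

    Returns (score, query_start, query_end) of the resulting segment pair,
    with query_end exclusive. x_drop=20 is an arbitrary illustrative value.
    """
    seed = sum(match if query[q_start + i] == subject[s_start + i] else mismatch
               for i in range(word_len))
    right, right_len = _extend(query, subject, q_start + word_len,
                               s_start + word_len, +1, seed,
                               x_drop, match, mismatch)
    total, left_len = _extend(query, subject, q_start - 1, s_start - 1, -1,
                              right, x_drop, match, mismatch)
    return total, q_start - left_len, q_start + word_len + right_len


query = "TTTACGTACGTACGCCC"
subject = "GGGACGTACGTACGAAA"
# Seed on the shared word "GTAC" at position 5 of both sequences.
print(extend_word_hit(query, subject, 5, 5, 4))  # (55, 3, 14)
```

In this example the extension recovers the full shared segment of 11 matching residues (11 x 5 = 55); the flanking mismatches lower the running score but never raise the maximum, so they are not included in the reported segment.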

The BLAST algorithm also performs a statistical analysis of the similarity 
between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 
(1993)). One measure of similarity provided by the BLAST algorithm is the smallest 
sum probability (P(N)), which provides an indication of the probability by which a match 
between two nucleotide or amino acid sequences would occur by chance. For example, a 
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a 
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more 
preferably less than about 0.01, and most preferably less than about 0.001. 

An indication that two nucleic acid sequences or polypeptides are substantially 
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross 
reactive with the antibodies raised against the polypeptide encoded by the second nucleic 
acid, as described below. Thus, a polypeptide is typically substantially identical to a second 
polypeptide, for example, where the two peptides differ only by conservative substitutions. 
Another indication that two nucleic acid sequences are substantially identical is that the two 
molecules or their complements hybridize to each other under stringent conditions, as 
described below. Yet another indication that two nucleic acid sequences are substantially 
identical is that the same primers can be used to amplify the sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an 
expression vector and supports the replication or expression of the expression vector. Host 
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be 
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or 
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture 
Collection catalog or web site, www.atcc.org). 

The terms "polypeptide," "peptide" and "protein" are used interchangeably 
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers 
in which one or more amino acid residue is an artificial chemical mimetic of a corresponding 
naturally occurring amino acid, as well as to naturally occurring amino acid polymers and 
non-naturally occurring amino acid polymers. 

The term "amino acid" refers to naturally occurring and synthetic amino acids, 
as well as amino acid analogs and amino acid mimetics that function in a manner similar to 
the naturally occurring amino acids. Naturally occurring amino acids are those encoded by 
the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, 
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have 
the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is 
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, 
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified 
R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical 
structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical 
compounds that have a structure that is different from the general chemical structure of an 
amino acid, but that function in a manner similar to a naturally occurring amino acid. 

Amino acids may be referred to herein by either their commonly known three 
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical 
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly 
accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic 
acid sequences. With respect to particular nucleic acid sequences, conservatively modified 
variants refers to those nucleic acids which encode identical or essentially identical amino 
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to 
essentially identical sequences. Because of the degeneracy of the genetic code, a large 
number of functionally identical nucleic acids encode any given protein. For instance, the 
codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every 
position where an alanine is specified by a codon, the codon can be altered to any of the 
corresponding codons described without altering the encoded polypeptide. Such nucleic acid 
variations are "silent variations," which are one species of conservatively modified 
variations. Every nucleic acid sequence herein which encodes a polypeptide also describes 
every possible silent variation of the nucleic acid. One of skill will recognize that each codon 
in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, 
which is ordinarily the only codon for tryptophan) can be modified to yield a functionally 
identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a 
polypeptide is implicit in each described sequence with respect to the expression product, but 
not with respect to actual probe sequences. 
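
The degeneracy argument above can be made concrete with a short sketch that lists the codons encoding the same amino acid and counts how many coding sequences specify a given polypeptide. It assumes the standard genetic code written in the DNA alphabet (T in place of U); the function names are chosen for this example only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon: str) -> list:
    """All codons (including the input) that encode the same amino acid."""
    aa = CODON_TABLE[codon.upper().replace("U", "T")]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_silent_variants(coding_seq: str) -> int:
    """Number of coding sequences, counting the input itself, that encode
    the same polypeptide as coding_seq."""
    seq = coding_seq.upper().replace("U", "T")
    if len(seq) % 3:
        raise ValueError("length must be a multiple of three")
    total = 1
    for i in range(0, len(seq), 3):
        total *= len(synonymous_codons(seq[i:i + 3]))
    return total


print(synonymous_codons("GCU"))            # ['GCA', 'GCC', 'GCG', 'GCT']
print(count_silent_variants("ATGGCATGG"))  # 4
```

The second call reflects the text above: methionine and tryptophan each have a single codon, so only the alanine codon in Met-Ala-Trp can vary, giving four equivalent coding sequences.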

As to amino acid sequences, one of skill will recognize that individual 
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein 
sequence which alters, adds or deletes a single amino acid or a small percentage of amino 
acids in the encoded sequence is a "conservatively modified variant" where the alteration 
results in the substitution of an amino acid with a chemically similar amino acid. 
Conservative substitution tables providing functionally similar amino acids are well known in 
the art. Such conservatively modified variants are in addition to and do not exclude 
polymorphic variants, interspecies homologs, and alleles of the invention. 

The following eight groups each contain amino acids that are conservative 
substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid 
(E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), 
Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan 
(W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, 
Proteins (1984)). 
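
The eight groups listed above lend themselves to a simple lookup. The sketch below encodes them with one-letter symbols and tests whether replacing one residue with another counts as a conservative substitution under these groups alone; the names are illustrative, and other substitution tables known in the art would give different answers.

```python
# The eight conservative substitution groups listed above (one-letter symbols).
# Methionine (M) appears in two groups, as in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share at least one of the groups."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group
                          for group in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # True
print(is_conservative("M", "C"))  # True
print(is_conservative("A", "D"))  # False
```

Residues that appear in none of the eight groups, such as histidine and proline, have no conservative partner under this scheme.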

Macromolecular structures such as polypeptide structures can be described in 
terms of various levels of organization. For a general discussion of this organization, see, 
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor and Schimmel, 
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980). 
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary 
structure" refers to locally ordered, three dimensional structures within a polypeptide. These 
structures are commonly known as domains. Domains are portions of a polypeptide that 
form a compact unit of the polypeptide and are typically 25 to approximately 500 amino 
acids long. Typical domains are made up of sections of lesser organization such as stretches 
of β-sheet and α-helices. "Tertiary structure" refers to the complete three dimensional 
structure of a polypeptide monomer. "Quaternary structure" refers to the three dimensional 
structure formed, usually by the noncovalent association of independent tertiary units. 
Anisotropic terms are also known as energy terms. 

WO 02/079492 
PCT/US02/04915 

A "label" or a "detectable moiety" is a composition detectable by 
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical 
means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, 
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins 
which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to 
detect antibodies specifically reactive with the peptide. 

An "effector" or "effector moiety" or "effector component" is a molecule that 
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or 
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody. 
The "effector" can be a variety of molecules including, for example, detection moieties 
including radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such 
as epitope tags, a toxin; a chemotherapeutic agent; a lipase; an antibiotic; or a radioisotope 
emitting "hard", e.g., beta, radiation. 

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either 
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der 
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be 
detected by detecting the presence of the label bound to the probe. Alternatively, methods 
using high affinity interactions may achieve the same results where one of a pair of binding 
partners binds to the other, e.g., biotin and streptavidin. 

As used herein a "nucleic acid probe or oligonucleotide" is defined as a 
nucleic acid capable of binding to a target nucleic acid of complementary sequence through 
one or more types of chemical bonds, usually through complementary base pairing, usually 
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C, 
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe 
may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere 
with hybridization. Thus, for example, probes may be peptide nucleic acids in which the 
constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be 
understood by one of skill in the art that probes may bind target sequences lacking complete 
complementarity with the probe sequence depending upon the stringency of the hybridization 
conditions. The probes are preferably directly labeled as with isotopes, chromophores, 
lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin 
complex may later bind. By assaying for the presence or absence of the probe, one can detect 
the presence or absence of the select sequence or subsequence. 

The term "recombinant" when used with reference, e.g., to a cell, or nucleic 
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been 
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a 
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for 
example, recombinant cells express genes that are not found within the native (non- 
recombinant) form of the cell or express native genes that are otherwise abnormally 
expressed, under expressed or not expressed at all. 

The term "heterologous" when used with reference to portions of a nucleic 
acid indicates that the nucleic acid comprises two or more subsequences that are not found in 
the same relationship to each other in nature. For instance, the nucleic acid is typically 
recombinantly produced, having two or more sequences from unrelated genes arranged to 
make a new functional nucleic acid, e.g., a promoter from one source and a coding region 
from another source. Similarly, a heterologous protein indicates that the protein comprises 
two or more subsequences that are not found in the same relationship to each other in nature 
(e.g., a fusion protein). 

A "promoter" is defined as an array of nucleic acid control sequences that 
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic 

20 acid sequences near the start site of transcription, such as, in the case of a polymerase II type 
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor 
elements, which can be located as much as several thousand base pairs from the start site of 
transcription. A "constitutive" promoter is a promoter that is active under most 
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a 
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, 
wherein the expression control sequence directs transcription of the nucleic acid 
corresponding to the second sequence. 

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or 
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be 
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and 
will be different in different circumstances. Longer sequences hybridize specifically at 
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than 
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other 
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as 
formamide. For selective or specific hybridization, a positive signal is at least two times 
background, preferably 10 times background hybridization. Exemplary stringent 
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C 
depending on primer length. For high stringency PCR amplification, a temperature of about 
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for 
30 sec - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about 
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc. N.Y.
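
The relationship between Tm and a stringent hybridization temperature described above can be illustrated with a minimal Python sketch. The Wallace rule used here (2°C per A or T, 4°C per G or C) is a common rule of thumb for short oligonucleotides and is an assumption of this example; it is not taken from this specification, and it ignores ionic strength, pH, and probe concentration. The function names are hypothetical.

```python
def wallace_tm(probe):
    """Approximate Tm in degrees C of a short DNA probe (Wallace rule, an assumption)."""
    seq = probe.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("probe must contain only A, C, G and T")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm):
    """Temperature range about 5-10 degrees C below Tm, as (low, high)."""
    return (tm - 10.0, tm - 5.0)
```

For a 20-nucleotide probe with equal numbers of each base, this estimate gives a Tm of 60°C and a stringent window of 50-55°C. A nearest-neighbor model would be a better choice where accuracy matters.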

Nucleic acids that do not hybridize to each other under stringent conditions are 
still substantially identical if the polypeptides which they encode are substantially identical. 
This occurs, for example, when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize 
under moderately stringent hybridization conditions. Exemplary "moderately stringent 
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl, 
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.
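
The signal criterion stated above (a positive hybridization is at least twice background, preferably 10 times background for selective hybridization) can be sketched as a simple ratio test. The function name and arguments are hypothetical, and real array data would normally be background-corrected and normalized first.

```python
def is_positive_signal(signal, background, min_fold=2.0):
    """Return True when signal is at least min_fold times background."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background >= min_fold
```

Passing `min_fold=10.0` applies the preferred, stricter criterion.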

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of an angiogenesis protein includes the determination of a parameter
that is indirectly or directly under the influence of the angiogenesis protein, e.g., a functional, 
physical, or chemical effect, such as the ability to increase or decrease angiogenesis. It 
includes binding activity, the ability of cells to proliferate, expression in cells undergoing 
angiogenesis, and other characteristics of angiogenic cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of an
angiogenesis protein sequence, e.g., functional, physical and chemical effects. Such
functional effects can be measured by any means known to those skilled in the art, e.g.,
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index),
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein, 
measuring inducible markers or transcriptional activation of the angiogenesis protein; 
measuring binding activity or binding assays, e.g. binding to antibodies, and measuring 
cellular proliferation, particularly endothelial cell proliferation, cell viability, cell division
especially of endothelial cells, lumen formation and capillary or vessel growth or formation.
Determination of the functional effect of a compound on angiogenesis can also be performed
using angiogenesis assays known to those of skill in the art such as in vitro assays, e.g., in
vitro endothelial cell tube formation assays, and other assays such as the chick CAM assay, 
the mouse corneal assay, and assays that assess vascularization of an implanted tumor. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
e.g., tube or blood vessel formation, measurement of changes in RNA or protein levels for
angiogenesis-associated sequences, measurement of RNA stability, identification of
downstream or reporter gene expression (CAT, luciferase, β-gal, GFP and the like), e.g., via
chemiluminescence, fluorescence, colorimetric reactions, antibody binding, inducible 
markers, and ligand binding assays. 

"Inhibitors", "activators", and "modulators" of angiogenic polynucleotide and 
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules
identified using in vitro and in vivo assays of angiogenic polynucleotide and polypeptide
sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block activity, 
decrease, prevent, delay activation, inactivate, desensitize, or down regulate the activity or 
expression of angiogenesis proteins, e.g., antagonists. "Activators" are compounds that 
increase, open, activate, facilitate, enhance activation, sensitize, agonize, or up regulate
angiogenesis protein activity. Inhibitors, activators, or modulators also include genetically
modified versions of angiogenesis proteins, e.g., versions with altered activity, as well as 
naturally occurring and synthetic ligands, antagonists, agonists, antibodies, small chemical 
molecules and the like. Such assays for inhibitors and activators include, e.g., expressing the 
angiogenic protein in vitro, in cells, or cell membranes, applying putative modulator
compounds, and then determining the functional effects on activity, as described above.
Activators and inhibitors of angiogenesis can also be identified by incubating angiogenic 
cells with the test compound and determining increases or decreases in the expression of 1 or 
more angiogenesis proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 or more angiogenesis 
proteins, such as angiogenesis proteins comprising the sequences set out in Table 8. 

Samples or assays comprising angiogenesis proteins that are treated with a
potential activator, inhibitor, or modulator are compared to control samples without the
inhibitor, activator, or modulator to examine the extent of inhibition or activation. Control samples
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition
of a polypeptide is achieved when the activity value relative to the control is about 80%,
preferably 50%, more preferably 25-0%. Activation of an angiogenesis polypeptide is
achieved when the activity value relative to the control (untreated with activators) is 110%,
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the
control), more preferably 1000-3000% higher.
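
The comparison to control described above can be sketched as follows. This minimal example assumes that "about 80%" is read as 80% of control activity or lower, and that "110%" is read as 110% or higher; the function name and the "no effect" label for values in between are hypothetical.

```python
def classify_activity(treated, control):
    """Classify a treated sample by its activity as a percentage of the control (100%)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    percent = 100.0 * treated / control
    if percent <= 80.0:   # assumed reading of "about 80%"
        return "inhibition"
    if percent >= 110.0:  # assumed reading of "110%"
        return "activation"
    return "no effect"
```

The preferred tiers (50%, 25-0%, 150%, 200-500%) could be added as further thresholds in the same manner.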

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen.
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody 
will be most critical in specificity and affinity of binding. 

An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of
well-characterized fragments produced by digestion with various peptidases. Thus, for example,
pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2,
a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide linkage in the hinge region,
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.
1993)). While various antibody fragments are defined in terms of the digestion of an intact
antibody, one of skill will appreciate that such fragments may be synthesized de novo either
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used
herein, also includes antibody fragments either produced by the modification of whole
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature
348:552-554 (1990)).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein,
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4: 72 (1983); Cole et al., pp.
77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985); Coligan,
Current Protocols in Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual 
(1988); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986)).
Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can be
adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty et al., Nature
348:552-554 (1990); Marks et al., Biotechnology 10:779-783 (1992)). 

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function 
and/or species, or an entirely different molecule which confers new properties to the chimeric
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof, is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 

The detailed description of the invention includes discussion of the following
aspects of the invention:

Expression of angiogenesis-associated sequences
Informatics
Angiogenesis-associated sequences
Detection of angiogenesis sequence for diagnostic and therapeutic applications
Modulators of angiogenesis
Methods of identifying variant angiogenesis-associated sequences
Administration of pharmaceutical and vaccine compositions
Kits for use in diagnostic and/or prognostic applications.

Expression of angiogenesis-associated sequences

In one aspect, the expression levels of genes are determined in different 
patient samples for which diagnosis information is desired, to provide expression profiles. 
An expression profile of a particular sample is essentially a "fingerprint" of the state of the 
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
unique to the state of the cell. That is, normal tissue may be distinguished from angiogenic tissue.
By comparing expression profiles of tissue in known different angiogenesis states, 
information regarding which genes are important (including both up- and down-regulation of 
genes) in each of these states is obtained. The identification of sequences that are
differentially expressed in angiogenic versus non-angiogenic tissue allows the use of this
information in a number of ways. For example, a particular treatment regime may be
evaluated: does a chemotherapeutic drug act to down-regulate angiogenesis, and thus tumor
growth or recurrence, in a particular patient? Similarly, diagnosis and treatment outcomes
may be done or confirmed by comparing patient samples with the known expression profiles.
Angiogenic tissue can also be analyzed to determine the stage of angiogenesis in the tissue. 
Furthermore, these gene expression profiles (or individual genes) allow screening of drug 
candidates with an eye to mimicking or altering a particular expression profile; for example, 
screening can be done for drugs that suppress the angiogenic expression profile. This may be
done by making biochips comprising sets of the important angiogenesis genes, which can
then be used in these screens. These methods can also be done on the protein basis; that is, 
protein expression levels of the angiogenic proteins can be evaluated for diagnostic purposes 
or to screen candidate agents. In addition, the angiogenic nucleic acid sequences can be 
administered for gene therapy purposes, including the administration of antisense nucleic
acids, or the angiogenic proteins (including antibodies and other modulators thereof)
administered as therapeutic drugs. 
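
The use of an expression profile as a "fingerprint" can be illustrated by matching a patient sample against reference profiles of known state. The following is a minimal sketch only: the choice of Pearson correlation as the similarity measure, the function names, and the dictionary layout are assumptions of this example and are not prescribed by this description.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("profiles must not be constant")
    return cov / (sd_x * sd_y)


def closest_state(sample, references):
    """Return the name of the reference profile most correlated with the sample."""
    return max(references, key=lambda state: pearson(sample, references[state]))
```

Each profile is a list of expression levels for the same genes in the same order; `references` maps a state name (for example "normal" or "angiogenic") to such a list.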

Thus the present invention provides nucleic acid and protein sequences that 
are differentially expressed in angiogenesis, herein termed "angiogenesis sequences". As 
outlined below, angiogenesis sequences include those that are up-regulated (i.e. expressed at
a higher level) in disorders associated with angiogenesis, as well as those that are
down-regulated (i.e. expressed at a lower level). In a preferred embodiment, the angiogenesis
sequences are from humans; however, as will be appreciated by those in the art, angiogenesis 
sequences from other organisms may be useful in animal models of disease and drug 
evaluation; thus, other angiogenesis sequences are provided, from vertebrates, including
mammals, including rodents (rats, mice, hamsters, guinea pigs, etc.), primates, and farm animals
(including sheep, goats, pigs, cows, horses, etc.). Angiogenesis sequences from other
organisms may be obtained using the techniques outlined below. 

Angiogenesis sequences can include both nucleic acid and amino acid 
sequences. In a preferred embodiment, the angiogenesis sequences are recombinant nucleic
acids. By the term "recombinant nucleic acid" herein is meant nucleic acid, originally formed
in vitro, in general, by the manipulation of nucleic acid, e.g., using polymerases and
endonucleases, in a form not normally found in nature. Thus an isolated nucleic acid, in a 
linear form, or an expression vector formed in vitro by ligating DNA molecules that are not 
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e. using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention.

Similarly, a "recombinant protein" is a protein made using recombinant 
techniques, i.e. through the expression of a recombinant nucleic acid as depicted above. A 
recombinant protein is distinguished from naturally occurring protein by at least one or more 
characteristics. For example, the protein may be isolated or purified away from some or all
of the proteins and compounds with which it is normally associated in its wild type host, and
thus may be substantially pure. For example, an isolated protein is unaccompanied by at least 
some of the material with which it is normally associated in its natural state, preferably 
constituting at least about 0.5%, more preferably at least about 5% by weight of the total 
protein in a given sample. A substantially pure protein comprises at least about 75% by
weight of the total protein, with at least about 80% being preferred, and at least about 90%
being particularly preferred. The definition includes the production of an angiogenesis protein 
from one organism in a different organism or host cell. Alternatively, the protein may be 
made at a significantly higher concentration than is normally seen, through the use of an 
inducible promoter or high expression promoter, such that the protein is made at increased
concentration levels. Alternatively, the protein may be in a form not normally found in
nature, as in the addition of an epitope tag or amino acid substitutions, insertions and 
deletions, as discussed below. 

In a preferred embodiment, the angiogenesis sequences are nucleic acids. As 
will be appreciated by those in the art and is more fully outlined below, angiogenesis
sequences are useful in a variety of applications, including diagnostic applications, which will
detect naturally occurring nucleic acids, as well as screening applications; for example, 
biochips comprising nucleic acid probes to the angiogenesis sequences can be generated. In 
the broadest sense, then, by "nucleic acid" or "oligonucleotide" or grammatical equivalents
herein is meant at least two nucleotides covalently linked together. A nucleic acid of the
present invention will generally contain phosphodiester bonds, although in some cases,
nucleic acid analogs are included that may have alternate backbones, comprising, for
example, phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical
Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other
analog nucleic acids include those with positive backbones; non-ionic backbones, and
non-ribose backbones, including those described in U.S. Patent Nos. 5,235,033 and 5,034,506,
and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate Modifications in
Antisense Research", Ed. Y.S. Sanghvi and P. Dan Cook. Nucleic acids containing one or
more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, for
example to increase the stability and half-life of such molecules in physiological 
environments or as probes on a biochip. 

As will be appreciated by those in the art, nucleic acid analogs may find use in
the present invention. In addition, mixtures of naturally occurring nucleic acids and analogs
can be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of 
naturally occurring nucleic acids and analogs may be made. 

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable. 

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded and single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. As used herein, the term "nucleoside"
includes nucleotides and nucleoside and nucleotide analogs, and modified nucleosides such 
as amino modified nucleosides. In addition, "nucleoside" includes non-naturally occurring 
analog structures. Thus for example the individual units of a peptide nucleic acid, each 
containing a base, are referred to herein as a nucleoside. 
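
The statement that a depicted single strand also defines its complementary strand can be illustrated with a short sketch. It assumes a DNA sequence written 5' to 3' using only the four standard bases (A pairs with T, C pairs with G); modified bases and analogs such as those listed above are not handled, and the function name is hypothetical.

```python
# Translation table for standard Watson-Crick DNA base pairing.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand of a DNA sequence, read 5' to 3'."""
    if any(base not in "ACGTacgt" for base in seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return seq.translate(_DNA_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check.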

An angiogenesis sequence can be initially identified by substantial nucleic
acid and/or amino acid sequence homology to the angiogenesis sequences outlined herein.
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, using either homology programs or hybridization
conditions.

For identifying angiogenesis-associated sequences, the angiogenesis screen 
typically includes comparing genes identified in a modification of an in vitro model of 
angiogenesis as described in Hiraoka, Cell 95:365 (1998) with genes identified in controls. 
Samples of normal tissue and tissue undergoing angiogenesis are applied to biochips
comprising nucleic acid probes. The samples are first microdissected, if applicable, and
treated as is known in the art for the preparation of mRNA. Suitable biochips are 
commercially available, for example from Affymetrix. Gene expression profiles as described 
herein are generated and the data analyzed. 

In a preferred embodiment, the genes showing changes in expression as
between normal and disease states are compared to genes expressed in other normal tissues,
including, but not limited to lung, heart, brain, liver, breast, kidney, muscle, prostate, small 
intestine, large intestine, spleen, bone and placenta. In a preferred embodiment, those genes 
identified during the angiogenesis screen that are expressed in any significant amount in other 
tissues are removed from the profile, although in some embodiments, this is not necessary.
That is, when screening for drugs, it is usually preferable that the target be disease specific, to
minimize possible side effects. 
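
The removal of genes that are expressed in other normal tissues can be sketched as a simple filter. The numeric cutoff standing in for "any significant amount" is a hypothetical parameter, as are the function name and data layout; this description does not specify a value.

```python
def disease_specific(candidates, normal_expression, threshold):
    """Keep candidate genes whose expression is below threshold in every normal tissue.

    normal_expression maps gene -> {tissue name: expression level}.
    Genes with no normal-tissue data are kept (an assumption of this sketch).
    """
    kept = []
    for gene in candidates:
        levels = normal_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept
```

Keeping genes that lack normal-tissue data is a permissive choice; a stricter screen could discard them instead.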

In a preferred embodiment, angiogenesis sequences are those that are up- 
regulated in angiogenesis disorders; that is, the expression of these genes is higher in the 
disease tissue as compared to normal tissue. "Up-regulation" as used herein means at least
about a two-fold change, preferably at least about a three fold change, with at least about
five-fold or higher being preferred. All accession numbers herein are for the GenBank 
sequence database and the sequences of the accession numbers are hereby expressly 
incorporated by reference. GenBank is known in the art, see, e.g., Benson, DA, et al., 
Nucleic Acids Research 26:1-7 (1998) and http://www.ncbi.nlm.nih.gov/. Sequences are also
available in other databases, e.g., European Molecular Biology Laboratory (EMBL) and
DNA Database of Japan (DDBJ). In addition, most preferred genes were found to be 
expressed in a limited amount or not at all in heart, brain, lung, liver, breast, kidney, prostate, 
small intestine and spleen.

In another preferred embodiment, angiogenesis sequences are those that are
down-regulated in the angiogenesis disorder; that is, the expression of these genes is lower in
angiogenic tissue as compared to normal tissue. "Down-regulation" as used herein means at
least about a two-fold change, preferably at least about a three fold change, with at least about
five-fold or higher being preferred.
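
The fold-change definitions of up-regulation and down-regulation given above can be sketched as follows. The default of two-fold follows the text; the function name, the "unchanged" label, and the requirement of positive expression values are assumptions of this example.

```python
def regulation(disease, normal, min_fold=2.0):
    """Classify a gene by the ratio of disease-tissue to normal-tissue expression."""
    if disease <= 0 or normal <= 0:
        raise ValueError("expression levels must be positive")
    ratio = disease / normal
    if ratio >= min_fold:
        return "up-regulated"
    if ratio <= 1.0 / min_fold:
        return "down-regulated"
    return "unchanged"
```

Setting `min_fold` to 3.0 or 5.0 applies the preferred, stricter thresholds.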

Angiogenesis sequences according to the invention may be classified into 
discrete clusters of sequences based on common expression profiles of the sequences. 
Expression levels of angiogenesis sequences may increase or decrease as a function of time in 
a manner that correlates with the induction of angiogenesis. Alternatively, expression levels
of angiogenesis sequences may both increase and decrease as a function of time. For
example, expression levels of some angiogenesis sequences are temporarily induced or
diminished during the switch to the angiogenesis phenotype, followed by a return to baseline
expression levels. Tables 1-8 provide genes, the mRNA expression of which varies as a
function of time in angiogenesis tissue when compared to normal tissue. 

In a particularly preferred embodiment, angiogenesis sequences are those that
are induced for a period of time, typically by positive angiogenic factors, followed by a return
to the baseline levels. Sequences that are temporarily induced provide a means to target
angiogenesis tissue, for example neovascularized tumors, at a particular stage of
angiogenesis, while avoiding rapidly growing tissue that requires perpetual vascularization.
Such positive angiogenic factors include αFGF, βFGF, VEGF, angiogenin and the like.

Induced angiogenesis sequences also are further categorized with respect to 
the timing of induction. For example, some angiogenesis genes may be induced at an early 
time period, such as within 10 minutes of the induction of angiogenesis. Others may be 
induced later, such as between 5 and 60 minutes, while yet others may be induced for a time
period of about two hours or more followed by a return to baseline expression levels.

In another preferred embodiment are angiogenesis sequences that are inhibited 
or reduced as a function of time followed by a return to "normal" expression levels. 
Inhibitors of angiogenesis are examples of molecules that have this expression profile. These 
sequences also can be further divided into groups depending on the timing of diminished
expression. For example, some molecules may display reduced expression within 10 minutes
of the induction of angiogenesis. Others may be diminished later, such as between 5 and 60 
minutes, while others may be diminished for a time period of about two hours or more
followed by a return to baseline. Examples of such negative angiogenic factors include
thrombospondin and endostatin to name a few. 

In yet another preferred embodiment are angiogenesis sequences that are 
induced for prolonged periods. These sequences are typically associated with induction of 
angiogenesis and may participate in induction and/or maintenance of the angiogenesis
phenotype. 

In another preferred embodiment are angiogenesis sequences, the expression 
of which is reduced or diminished for prolonged periods in angiogenic tissue. These 
sequences are typically angiogenesis inhibitors and their diminution is correlated with an 
increase in angiogenesis.
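
The temporal categories described in the preceding paragraphs (transient or prolonged induction, transient or prolonged reduction) can be illustrated with a minimal sketch that classifies a time course of expression relative to baseline. The two-fold cutoff, the use of the last time point to decide whether expression has returned to baseline, and all names are assumptions of this example.

```python
def temporal_pattern(fold_changes, min_fold=2.0):
    """Classify a time course of expression/baseline ratios after induction.

    fold_changes lists the ratio at successive time points (1.0 = baseline).
    """
    if not fold_changes:
        raise ValueError("at least one time point is required")
    up = [f >= min_fold for f in fold_changes]
    down = [f <= 1.0 / min_fold for f in fold_changes]
    if any(up) and any(down):
        return "mixed"  # both increased and decreased over time
    if any(up):
        return "sustained induction" if up[-1] else "transient induction"
    if any(down):
        return "sustained reduction" if down[-1] else "transient reduction"
    return "unchanged"
```

A finer classification by time of onset (for example within 10 minutes, or between 5 and 60 minutes) would also need the sampling times, which this sketch omits.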

Informatics 

The ability to identify genes that undergo changes in expression with time 
during angiogenesis can additionally provide high-resolution, high-sensitivity datasets which
can be used in the areas of diagnostics, therapeutics, drug development, biosensor
development, and other related areas. For example, the expression profiles can be used in
diagnostic or prognostic evaluation of patients with angiogenesis-associated disease. Or as
another example, subcellular toxicological information can be generated to better direct drug
structure and activity correlation (see, Anderson, L., "Pharmaceutical Proteomics: Targets,
Mechanism, and Function," paper presented at the H3C Proteomics conference, Coronado,
CA (June 11-12, 1998)). Subcellular toxicological information can also be utilized in a
biological sensor device to predict the likely toxicological effect of chemical exposures and
likely tolerable exposure thresholds (see, U.S. Patent No. 5,811,231). Similar advantages
accrue from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that
includes at least one set of assay data. The data contained in the database is acquired,
e.g., using array analysis either singly or in a library format. The database can be in
substantially any form in which data can be maintained and transmitted, but is preferably an
electronic database. The electronic database of the invention can be maintained on any
electronic device allowing for the storage of and access to the database, such as a personal
computer, but is preferably distributed on a wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar
databases can be assembled for any assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative
and/or absolute abundance of a variety of molecular and macromolecular species from a
biological sample undergoing angiogenesis, i.e., the identification of angiogenesis-associated
sequences described herein, provide an abundance of information, which can be correlated
with pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological
status, among others. Although the data generated from the assays of the invention is suited
for manual review and analysis, in a preferred embodiment, prior data processing using
high-speed computers is utilized.

An array of methods for indexing and retrieving biomolecular information is
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational
database system for storing biomolecular sequence information in a manner that allows
sequences to be catalogued and searched according to one or more protein function
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records
containing information in a format that allows a collection of partial-length DNA sequences
to be catalogued and searched according to association with one or more sequencing projects
for obtaining full-length sequences from the collection of partial-length sequences. U.S.
Patent 5,706,498 discloses a gene database retrieval system for retrieving a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

The present invention provides a computer database comprising a computer
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing
sample is a control tissue sample known to be free of pathological disorders. In a
variation, at least one of the sources is a known pathological tissue specimen, e.g., a
neoplastic lesion or another tissue specimen to be analyzed for angiogenesis. In another
variation, the assay records cross-tabulate one or more of the following parameters for each
target species in a sample: (1) a unique identification code, which can include, e.g., a target
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species
present in the sample.
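
The three cross-tabulated parameters listed above can be pictured with a minimal sketch. The record layout, the names AssayRecord and cross_tabulate, and all identifiers and quantities below are hypothetical illustrations, not a format prescribed by the specification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    """One target species measured in one sample (hypothetical layout)."""
    target_id: str      # (1) unique identification code for the target
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Return {target_id: {sample_source: quantity}} for a list of records."""
    table = defaultdict(dict)
    for rec in records:
        table[rec.target_id][rec.sample_source] = rec.quantity
    return {target: dict(by_source) for target, by_source in table.items()}


# Illustrative values only; the identifiers and quantities are made up.
records = [
    AssayRecord("T001", "control tissue", 1.0),
    AssayRecord("T001", "neoplastic lesion", 4.2),
    AssayRecord("T002", "control tissue", 3.1),
    AssayRecord("T002", "neoplastic lesion", 0.6),
]
table = cross_tabulate(records)
for target, by_source in sorted(table.items()):
    print(target, by_source)
```

Keying the table first by target and then by sample source makes it straightforward to compare the quantity of one target between a control sample and a pathological specimen.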

The invention also provides for the storage and retrieval of a collection of
target data in a computer data storage apparatus, which can include magnetic disks, optical
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM,
magnetic bubble memory devices, and other data storage devices, including CPU registers
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern
in an array of magnetic domains on a magnetizable medium or as an array of charge states or
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of
a transistor and a charge storage area, which may be on the transistor). In one embodiment,
the invention provides such storage devices, and computer systems built therewith,
comprising a bit pattern encoding a protein expression fingerprint record comprising unique
identifiers for at least 10 target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides
a method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of
the invention in a file format suitable for retrieval and processing in a computerized sequence
analysis, comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line,
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium,
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM
cells) composing a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for
comparing a query target to a database containing an array of data structures, such as an assay
result obtained by the method of the invention, and ranking database targets based on the
degree of identity and gap weight to the target data. A central processor is preferably
initialized to load and execute the computer program for alignment and/or comparison of the
assay results. Data for a query target is entered into the central processor via an I/O device.
Execution of the computer program results in the central processor retrieving the assay data
from the data file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to
secondary memory, which is typically random access memory (e.g., DRAM, SRAM,
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the
same characteristic of the query target, and results are output via an I/O device. For example,
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha,
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or
public domain molecular biology software package (e.g., UWGCG Sequence Analysis
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory,
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem,
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or
other suitable I/O device.
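
The ranking step described above can be sketched as follows, assuming for illustration that the selected assay characteristic is a single numeric value per target and that correspondence is measured as the absolute difference from the query value. The function name rank_targets, the identifiers, and the values are hypothetical.

```python
def rank_targets(query_value, database, top_n=None):
    """Rank database targets by closeness of one characteristic to the query.

    database maps a target identifier to the measured value of the selected
    characteristic. A smaller absolute difference from query_value is treated
    as a closer correspondence; ties are broken by identifier.
    """
    ranked = sorted(
        database.items(),
        key=lambda item: (abs(item[1] - query_value), item[0]),
    )
    return ranked if top_n is None else ranked[:top_n]


# Illustrative values only, e.g. a normalized binding signal per target.
database = {"T001": 0.82, "T002": 0.15, "T003": 0.64, "T004": 0.79}
ranking = rank_targets(0.80, database)
for target_id, value in ranking:
    print(target_id, value)
```

A sequence-based comparison would replace the absolute difference with an alignment score, but the rank-ordering step stays the same.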

The invention also preferably provides the use of a computer system, such as
that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.



Angiogenesis-associated sequences

Angiogenesis proteins of the present invention may be classified as secreted
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the
angiogenesis protein is an intracellular protein. Intracellular proteins may be found in the
cytoplasm and/or in the nucleus or associated with the intracellular side of the plasma
membrane. Intracellular proteins are involved in all aspects of cellular function and
replication (including, e.g., signaling pathways); aberrant expression of such proteins often
results in unregulated or dysregulated cellular processes (see, e.g., Molecular Biology of the
Cell, 3rd Edition, Alberts, Ed., Garland Pub., 1994). For example, many intracellular
proteins have enzymatic activity such as protein kinase activity, protein phosphatase activity,
protease activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular
proteins also serve as docking proteins that are involved in organizing complexes of proteins,
or targeting proteins to various subcellular localizations, and are involved in maintaining the
structural integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence
in the proteins of one or more motifs to which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence-dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine-phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of primary
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and the molecules with which the protein may associate.

In another embodiment, the angiogenesis sequences are transmembrane
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell.
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2
domain-containing proteins.

Transmembrane proteins may contain from one to many transmembrane
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that
may be followed or flanked by charged amino acids. Therefore, upon analysis of the amino
acid sequence of a particular protein, the localization and number of transmembrane domains
within the protein may be predicted (see, e.g., the PSORT web site http://psort.nibb.ac.jp/).
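
The idea of locating a run of roughly 20 hydrophobic residues can be illustrated with a simple sliding-window hydropathy scan. This is a minimal sketch and not the PSORT algorithm: it assumes the Kyte-Doolittle hydropathy scale, a 20-residue window, and a mean-hydropathy cutoff of 1.6, all of which are choices made here for illustration. The function name and the toy sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_segments(seq, window=20, threshold=1.6):
    """Return merged (start, end) spans, 0-based and end-exclusive, covering
    every window whose mean hydropathy is at least `threshold`."""
    seq = seq.upper()
    spans = []
    for i in range(len(seq) - window + 1):
        mean = sum(KD[aa] for aa in seq[i:i + window]) / window
        if mean >= threshold:
            if spans and i <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


# Toy sequence: charged flanks around a 20-residue hydrophobic core.
toy = "MKRSDEEQKR" + "LLIVALLAVLLIVGALLAVL" + "KRRDEESQNK"
segments = hydrophobic_segments(toy)
print(segments)
```

Each returned span is a candidate membrane-spanning region; counting the spans gives a rough estimate of the number of transmembrane domains. Because windows that only partly overlap the core can also pass the cutoff, a span may extend a few residues into the flanking sequence.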

The extracellular domains of transmembrane proteins are diverse; however,
conserved motifs are found repeatedly among various extracellular domains. Conserved
structure and/or functions have been ascribed to different extracellular motifs. Many
extracellular domains are involved in binding to other molecules. In one aspect, extracellular
domains are found on receptors. Factors that bind the receptor domain include circulating
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like.
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, for example, via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains also
associate with the extracellular matrix and contribute to the maintenance of the cell structure.

Angiogenesis proteins that are transmembrane are particularly preferred in the
present invention as they are readily accessible targets for immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can
be made soluble by removing transmembrane sequences, for example through recombinant
methods. Furthermore, transmembrane proteins that have been made soluble can be made to
be secreted through recombinant means by adding an appropriate signal sequence.

In another embodiment, the angiogenesis proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Angiogenesis proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, e.g.,
for blood or serum tests.

An angiogenesis sequence is typically initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the angiogenesis sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on an mRNA are found on the same
molecule.

As detailed in the definitions, percent identity can be determined using an
algorithm such as BLAST. A preferred method utilizes the BLASTN module of
WU-BLAST-2 set to the default parameters, with overlap span and overlap fraction set to 1 and
0.125, respectively. The alignment may include the introduction of gaps in the sequences to
be aligned. In addition, for sequences which contain either more or fewer nucleotides than
those of the nucleic acids of the figures, it is understood that the percentage of homology will
be determined based on the number of homologous nucleosides in relation to the total number
of nucleosides. Thus, for example, homology of sequences shorter than those of the
sequences identified herein and as discussed below, will be determined using the number of
nucleosides in the shorter sequence.
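
The convention of using the shorter sequence as the denominator can be sketched as follows. This is an illustration only and does not reproduce WU-BLAST-2: it assumes the two sequences have already been aligned, with gaps written as "-", and simply counts identical positions. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences, using the ungapped
    length of the shorter sequence as the denominator."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both sequences carry the same non-gap residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# A 10-base reference aligned against an 8-base fragment with one mismatch.
reference = "ACGTACGTAC"
fragment = "--GTACGAAC"
print(percent_identity(reference, fragment))  # 7 matches / 8 bases = 87.5
```

Dividing by the shorter length means that a fragment is not penalized for the portion of the longer sequence it does not cover.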

In one embodiment, the nucleic acid homology is determined through
hybridization studies. Thus, e.g., a nucleic acid which hybridizes under high stringency to a
nucleic acid of Tables 1-8, or its complement, or which is also found on naturally occurring
mRNAs, is considered an angiogenesis sequence. In another embodiment, less stringent
hybridization conditions are used; for example, moderate or low stringency conditions may
be used, as are known in the art; see Ausubel, supra, and Tijssen, supra.

In addition, the angiogenesis nucleic acid sequences of the invention, e.g., the
sequences in Tables 1-8, are fragments of larger genes, i.e., they are nucleic acid segments.
"Genes" in this context includes coding regions, non-coding regions, and mixtures of coding
and non-coding regions. Accordingly, as will be appreciated by those in the art, using the
sequences provided herein, extended sequences, in either direction, of the angiogenesis genes
can be obtained, using techniques well known in the art for cloning either longer sequences or
the full length sequences; see Ausubel et al., supra. Much can be done by informatics, and
many sequences can be clustered to include multiple sequences, e.g., using systems such as
UniGene (see http://www.ncbi.nlm.nih.gov/UniGene/).

Once the angiogenesis nucleic acid is identified, it can be cloned and, if
necessary, its constituent parts recombined to form the entire angiogenesis nucleic acid
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g.,
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid
segment, the recombinant angiogenesis nucleic acid can be further used as a probe to identify
and isolate other angiogenesis nucleic acids, for example extended coding regions. It can
also be used as a "precursor" nucleic acid to make modified or variant angiogenesis nucleic
acids and proteins.

The angiogenesis nucleic acids of the present invention are used in several
ways. In a first embodiment, nucleic acid probes to the angiogenesis nucleic acids are made
and attached to biochips to be used in screening and diagnostic methods, as outlined below,
or for administration, for example for gene therapy, vaccine, and/or antisense applications.
Alternatively, the angiogenesis nucleic acids that include coding regions of angiogenesis
proteins can be put into expression vectors for the expression of angiogenesis proteins, again
for screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to angiogenesis nucleic acids
(both the nucleic acid sequences outlined in the figures and/or the complements thereof) are
made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the angiogenesis nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, for example in sandwich assays), such
that hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.

A nucleic acid probe is generally single stranded but can be partially single
and partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole
genes are not used. In some embodiments, much longer nucleic acids can be used, up to
hundreds of bases.

In a preferred embodiment, more than one probe per sequence is used, with
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
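
The use of several probes per target can be pictured with a short sketch that picks evenly spaced windows along a target sequence and returns the reverse complement of each window as the probe. The even spacing, the default of three 30-base probes, the function names, and the toy target are assumptions made for illustration; the text does not prescribe a particular selection scheme.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tile_probes(target, probe_len=30, n_probes=3):
    """Return a list of (start, probe) pairs for n_probes evenly spaced
    windows of probe_len bases; each probe is complementary to its window."""
    target = target.upper()
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the probe length")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = [round(k * last_start / (n_probes - 1)) for k in range(n_probes)]
    return [(s, reverse_complement(target[s:s + probe_len])) for s in starts]


# Toy 50-base target: three 30-base probes must overlap one another.
target = "ACGTTGCAAG" * 5
probes = tile_probes(target, probe_len=30, n_probes=3)
for start, probe in probes:
    print(start, probe)
```

With a short target the windows overlap, as in the toy example; with a target longer than the combined probe length the same spacing yields separate, non-overlapping probes.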

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant that the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,






WO 02/079492 PCT/US02/04915 

hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as
will be appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid
support" or other grammatical equivalents herein is meant a material that can be modified to
contain discrete individual sites appropriate for the attachment or association of the nucleic
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large, and includes, but is not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or
silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in copending application entitled Reusable
Low Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15,
1999, herein incorporated by reference in its entirety.

Generally the substrate is planar, although as will be appreciated by those in
the art, other configurations of substrates may be used as well. For example, the probes may
be placed on the inside surface of a tube, for flow-through sample analysis to minimize
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including
closed cell foams made of particular plastics.

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, for
example, the biochip is derivatized with a chemical functional group including, but not
limited to, amino groups, carboxy groups, oxo groups and thiol groups, with amino groups
being particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, for example using linkers as are known in the
art; for example, homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce
Chemical Company catalog, technical section on cross-linkers, pages 155-200, incorporated
herein by reference). In addition, in some cases, additional linkers, such as alkyl groups
(including substituted and heteroalkyl groups) may be used.

In this embodiment, oligonucleotides are synthesized as is known in the art,
and then attached to the surface of the solid support. As will be appreciated by those skilled
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may
be via an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is
known in the art. For example, photoactivation techniques utilizing photopolymerization
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression
level of angiogenesis-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, an angiogenesis-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will
be proportional to the amount of template in the original sample. Comparison to appropriate
controls provides a measure of the amount of angiogenesis-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods
and Applications, Academic Press, Inc., N.Y.
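
The comparison to a control can be illustrated with an idealized model that is an assumption made here and is not taken from the text or from the cited protocols. If each cycle multiplies the amount of product by (1 + E), where E is the per-cycle efficiency, then a sample that reaches a fixed detection threshold d cycles earlier than a control started with (1 + E)^d times as much template. The function name and the cycle values below are hypothetical.

```python
def relative_template(cycle_sample, cycle_control, efficiency=1.0):
    """Fold difference in starting template, sample versus control.

    Assumes ideal exponential amplification, in which each cycle multiplies
    the product by (1 + efficiency), and that both reactions are read at the
    same detection threshold. cycle_sample and cycle_control are the cycle
    numbers at which each reaction crosses that threshold.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return (1.0 + efficiency) ** (cycle_control - cycle_sample)


# Illustrative values: the sample crosses the threshold 3 cycles earlier.
fold = relative_template(cycle_sample=22.0, cycle_control=25.0)
print(fold)  # 2 ** 3 = 8.0 with perfect doubling per cycle
```

Real reactions rarely double perfectly, so in practice the efficiency would have to be estimated, for example from a dilution series, before a fold difference is reported.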

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, for example, literature provided by Perkin-Elmer, e.g.,
www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase
chain reaction (LCR) (see, Wu and Wallace (1989) Genomics 4: 560, Landegren et al. (1988)
Science 241: 1077, and Barringer et al. (1990) Gene 89: 117), transcription amplification
(Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), self-sustained sequence replication
(Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874), dot PCR, and linker adapter PCR,
etc.

In a preferred embodiment, angiogenesis nucleic acids, e.g., encoding
angiogenesis proteins, are used to make a variety of expression vectors to express
angiogenesis proteins which can then be used in screening assays, as described below.
Expression vectors and recombinant DNA technology are well known to those of skill in the
art (see, e.g., Ausubel, supra, and Gene Expression Systems, Fernandez & Hoeffler, Eds,
Academic Press, 1999) and are used to express proteins. The expression vectors may be
either self-replicating extrachromosomal vectors or vectors which integrate into a host
genome. Generally, these expression vectors include transcriptional and translational
regulatory nucleic acid operably linked to the nucleic acid encoding the angiogenesis protein.
The term "control sequences" refers to DNA sequences used for the expression of an
operably linked coding sequence in a particular host organism. Control sequences that are
suitable for prokaryotes, for example, include a promoter, optionally an operator sequence,
and a ribosome binding site. Eukaryotic cells are known to utilize promoters,
polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is typically accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are
used in accordance with conventional practice. Transcriptional and translational regulatory
nucleic acid will generally be appropriate to the host cell used to express the angiogenesis
protein; for example, transcriptional and translational regulatory nucleic acid sequences from
Bacillus are preferably used to express the angiogenesis protein in Bacillus. Numerous types
of appropriate expression vectors, and suitable regulatory sequences, are known in the art for
a variety of host cells.

In general, transcriptional and translational regulatory sequences may include,
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and
stop sequences, translational start and stop sequences, and enhancer or activator sequences.
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional
start and stop sequences.

Promoter sequences encode either constitutive or inducible promoters. The
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art,
and are useful in the present invention.

In addition, an expression vector may comprise additional elements. For
example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, for example in mammalian or insect cells for expression and in
a prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra). See also Kitamura, et
al. (1995) PNAS 92:9146-9150.

In addition, in a preferred embodiment, the expression vector contains a
selectable marker gene to allow the selection of transformed host cells. Selection genes are
well known in the art and will vary with the host cell used.

The angiogenesis proteins of the present invention are produced by culturing a
host cell transformed with an expression vector containing nucleic acid encoding an
angiogenesis protein, under the appropriate conditions to induce or cause expression of the
angiogenesis protein. Conditions appropriate for angiogenesis protein expression will vary
with the choice of the expression vector and the host cell, and will be easily ascertained by
one skilled in the art through routine experimentation or optimization. For example, the use
of constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect
and animal cells, including mammalian cells. Of particular interest are Saccharomyces
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells,
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines.

In a preferred embodiment, the angiogenesis proteins are expressed in
mammalian cells. Mammalian expression systems are also known in the art, and include
retroviral and adenoviral systems. Of particular use as mammalian promoters are the
promoters from mammalian viral genes, since the viral genes are often highly expressed and
have a broad host range. Examples include the SV40 early promoter, mouse mammary tumor
virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, and the
CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription termination
and polyadenylation sequences recognized by mammalian cells are regulatory regions located
3' to the translation stop codon and thus, together with the promoter elements, flank the
coding sequence. Examples of transcription terminator and polyadenylation signals include
those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei.

In a preferred embodiment, angiogenesis proteins are expressed in bacterial
systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; for example, the tac promoter is a hybrid of the trp and
lac promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the angiogenesis protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptomyces lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial
expression vectors are transformed into bacterial host cells using techniques well known in
the art, such as calcium chloride treatment, electroporation, and others.

In one embodiment, angiogenesis proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based
expression vectors, are well known in the art.

In a preferred embodiment, angiogenesis protein is produced in yeast cells.
Yeast expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The angiogenesis protein may also be made as a fusion protein, using
techniques well known in the art. Thus, for example, for the creation of monoclonal
antibodies, if the desired epitope is small, the angiogenesis protein may be fused to a carrier
protein to form an immunogen. Alternatively, the angiogenesis protein may be made as a
fusion protein to increase expression, or for other reasons. For example, when the
angiogenesis protein is an angiogenesis peptide, the nucleic acid encoding the peptide may be
linked to another nucleic acid for expression purposes. Fusion with detection epitope tags
can be made, e.g., with FLAG, His6, myc, HA, etc.

In one embodiment, the angiogenesis nucleic acids, proteins and antibodies of
the invention are labeled. By "labeled" herein is meant that a compound has at least one
element, isotope or chemical compound attached to enable the detection of the compound. In
general, labels fall into three classes: a) isotopic labels, which may be radioactive or heavy
isotopes; b) immune labels, which may be antibodies, antigens, or epitope tags; and c) colored
or fluorescent dyes. The labels may be incorporated into the angiogenesis nucleic acids,
proteins and antibodies at any position. For example, the label should be capable of
producing, either directly or indirectly, a detectable signal. The detectable moiety may be a
radioisotope, such as 3H, 14C, 32P, 35S, or 125I, a fluorescent or chemiluminescent compound,
such as fluorescein isothiocyanate, rhodamine, or luciferin, or an enzyme, such as alkaline
phosphatase, beta-galactosidase or horseradish peroxidase. Any method known in the art for
conjugating the antibody to the label may be employed, including those methods described by
Hunter et al., Nature, 144:945 (1962); David et al., Biochemistry, 13:1014 (1974); Pain et
al., J. Immunol. Meth., 40:219 (1981); and Nygren, J. Histochem. and Cytochem., 30:407
(1982).

Accordingly, the present invention also provides angiogenesis protein
sequences. An angiogenesis protein of the present invention may be identified in several
ways. "Protein" in this sense includes proteins, polypeptides, and peptides. As will be
appreciated by those in the art, the nucleic acid sequences of the invention can be used to
generate protein sequences. There are a variety of ways to do this, including cloning the
entire gene and verifying its frame and amino acid sequence, or by comparing it to known
sequences to search for homology to provide a frame, assuming the angiogenesis protein has
an identifiable motif or homology to some protein in the database being used. Generally, the
nucleic acid sequences are input into a program that will search all three frames for
homology. This is done in a preferred embodiment using the following NCBI Advanced
BLAST parameters. The program is blastx or blastn. The database is nr. The input data is as
"Sequence in FASTA format". The organism list is "none". The "expect" is 10; the filter is
default. The "descriptions" is 500, the "alignments" is 500, and the "alignment view" is
pairwise. The "Query Genetic Codes" is standard (1). The matrix is BLOSUM62; gap
existence cost is 11, per residue gap cost is 1; and the lambda ratio is 0.85 (default). This
results in the generation of a putative protein sequence.
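
The frame search described above can be sketched in a few lines of Python. This is a
minimal illustration, not the BLAST program itself: it simply translates a nucleic acid
sequence in the three forward reading frames using the standard genetic code (code 1).
The function names and the example sequence are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate `seq` starting at offset `frame` (0, 1 or 2).

    Stop codons are written as '*'; codons containing ambiguous bases as 'X'.
    Trailing bases that do not form a complete codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq):
    """Return a dict mapping frame number (1, 2, 3) to its translation."""
    return {frame + 1: translate(seq, frame) for frame in range(3)}


if __name__ == "__main__":
    for frame, protein in three_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

Each of the three translations can then be compared against a protein database; the frame
that yields a significant match suggests the reading frame of the putative protein.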

Also included within one embodiment of angiogenesis proteins are amino acid
variants of the naturally occurring sequences, as determined herein. Preferably, the variants
are greater than about 75% homologous to the wild-type sequence, more
preferably greater than about 80%, even more preferably greater than about 85% and most
preferably greater than 90%. In some embodiments the homology will be as high as about 93
to 95 or 98%. As for nucleic acids, homology in this context means sequence similarity or
identity, with identity being preferred. This homology will be determined using standard
techniques well known in the art as are outlined above for the nucleic acid homologies.
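
As a rough illustration of how a variant might be scored against these cut-offs, the
following Python sketch computes percent identity over a pre-computed pairwise alignment.
It is a simplification under stated assumptions: the alignment is supplied by the caller,
identity is counted over all aligned columns (other conventions exist), and the function
names are hypothetical rather than part of any standard tool.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where both sequences have a gap are skipped; a column where only
    one sequence has a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def homology_tier(identity):
    """Return the highest cut-off (90, 85, 80 or 75) that `identity` exceeds, else None."""
    for cutoff in (90, 85, 80, 75):
        if identity > cutoff:
            return cutoff
    return None


if __name__ == "__main__":
    wild_type = "MKTAYIAKQR"          # made-up sequences for illustration
    variant = "MKTAYIAKQK"
    identity = percent_identity(wild_type, variant)
    print(identity, homology_tier(identity))
```

Because the text speaks of "about" 75%, 80% and so on, the hard thresholds used here are
only one possible reading.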

Angiogenesis proteins of the present invention may be shorter or longer than
the wild type amino acid sequences. Thus, in a preferred embodiment, included within the
definition of angiogenesis proteins are portions or fragments of the wild type sequences
herein. In addition, as outlined above, the angiogenesis nucleic acids of the invention may be
used to obtain additional coding regions, and thus additional protein sequence, using
techniques known in the art.

In a preferred embodiment, the angiogenesis proteins are derivative or variant
angiogenesis proteins as compared to the wild-type sequence. That is, as outlined more fully
below, the derivative angiogenesis peptide will often contain at least one amino acid
substitution, deletion or insertion, with amino acid substitutions being particularly preferred.
The amino acid substitution, insertion or deletion may occur at any residue within the
angiogenesis peptide.

Also included within one embodiment of angiogenesis proteins of the present
invention are amino acid sequence variants. These variants typically fall into one or more of
three classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the angiogenesis
protein, using cassette or PCR mutagenesis or other techniques well known in the art, to
produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell
culture as outlined above. However, variant angiogenesis protein fragments having up to
about 100-150 residues may be prepared by in vitro synthesis using established techniques.
Amino acid sequence variants are characterized by the predetermined nature of the variation,
a feature that sets them apart from naturally occurring allelic or interspecies variation of the
angiogenesis protein amino acid sequence. The variants typically exhibit the same qualitative
biological activity as the naturally occurring analogue, although variants can also be selected
which have modified characteristics as will be more fully outlined below.

While the site or region for introducing an amino acid sequence variation is
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed angiogenesis variants screened for
the optimal combination of desired activity. Techniques for making substitution mutations at
predetermined sites in DNA having a known sequence are well known, for example, M13
primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using assays of
angiogenesis protein activities.

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger.

Substitutions, deletions, insertions or any combination thereof may be used to
arrive at a final derivative. Generally these changes are done on a few amino acids to
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the angiogenesis protein are
desired, substitutions are generally made in accordance with the amino acid substitution chart
provided in the definition section.

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those provided in the definition of
"conservative substitution". For example, substitutions may be made which more
significantly affect: the structure of the polypeptide backbone in the area of the alteration, for
example the alpha-helical or beta-sheet structure; the charge or hydrophobicity of the
molecule at the target site; or the bulk of the side chain. The substitutions which in general
are expected to produce the greatest changes in the polypeptide's properties are those in
which (a) a hydrophilic residue, e.g. seryl or threonyl, is substituted for (or by) a hydrophobic
residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is
substituted for (or by) any other residue; (c) a residue having an electropositive side chain,
e.g. lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g.
glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g. phenylalanine, is
substituted for (or by) one not having a side chain, e.g. glycine.
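
Classes (a) through (d) above can be expressed as a simple lookup. The Python sketch below
is illustrative only: the residue groups contain just the examples named in the text (given
as one-letter codes), so it is not a complete classification of the twenty amino acids, and
the function name is hypothetical.

```python
# Residue groups limited to the examples named in the text (one-letter codes).
HYDROPHILIC = set("ST")        # seryl, threonyl
HYDROPHOBIC = set("LIFVA")     # leucyl, isoleucyl, phenylalanyl, valyl, alanyl
CYS_OR_PRO = set("CP")         # cysteine, proline
ELECTROPOSITIVE = set("KRH")   # lysyl, arginyl, histidyl
ELECTRONEGATIVE = set("ED")    # glutamyl, aspartyl
BULKY = set("F")               # phenylalanine
NO_SIDE_CHAIN = set("G")       # glycine


def _swapped(a, b, group_x, group_y):
    """True if one residue is in group_x and the other in group_y ("for (or by)")."""
    return (a in group_x and b in group_y) or (a in group_y and b in group_x)


def disruptive_classes(wild_type, variant):
    """Return the list of classes ('a'..'d') that a single substitution falls into."""
    if wild_type == variant:
        return []
    classes = []
    if _swapped(wild_type, variant, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if wild_type in CYS_OR_PRO or variant in CYS_OR_PRO:
        classes.append("b")
    if _swapped(wild_type, variant, ELECTROPOSITIVE, ELECTRONEGATIVE):
        classes.append("c")
    if _swapped(wild_type, variant, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes


if __name__ == "__main__":
    for wt, var in [("S", "L"), ("K", "E"), ("F", "G"), ("P", "A"), ("L", "I")]:
        print(wt, "->", var, disruptive_classes(wt, var))
```

A substitution that returns an empty list under this sketch is not thereby shown to be
conservative; it merely falls outside the four example classes.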

The variants typically exhibit the same qualitative biological activity and will
elicit the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the angiogenesis proteins as needed. Alternatively,
the variant may be designed such that the biological activity of the angiogenesis protein is
altered. For example, glycosylation sites may be altered or removed.

Covalent modifications of angiogenesis polypeptides are included within the
scope of this invention. One type of covalent modification includes reacting targeted amino
acid residues of an angiogenesis polypeptide with an organic derivatizing agent that is
capable of reacting with selected side chains or the N- or C-terminal residues of an
angiogenesis polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking angiogenesis polypeptides to a water-insoluble support matrix or surface for use
in the method for purifying anti-angiogenesis polypeptide antibodies or screening assays, as
is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for example,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-[(p-azidophenyl)dithio]propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues,
methylation of the α-amino groups of lysine, arginine, and histidine side chains [T.E.
Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San
Francisco, pp. 79-86 (1983)], acetylation of the N-terminal amine, and amidation of any
C-terminal carboxyl group.

Another type of covalent modification of the angiogenesis polypeptide
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes herein
to mean deleting one or more carbohydrate moieties found in native sequence angiogenesis
polypeptide, and/or adding one or more glycosylation sites that are not present in the native
sequence angiogenesis polypeptide. Glycosylation patterns can be altered in many ways. For
example, the use of different cell types to express angiogenesis-associated sequences can
result in different glycosylation patterns.

Addition of glycosylation sites to angiogenesis polypeptides may also be
accomplished by altering the amino acid sequence thereof. The alteration may be made, for
example, by the addition of, or substitution by, one or more serine or threonine residues to the
native sequence angiogenesis polypeptide (for O-linked glycosylation sites). The
angiogenesis amino acid sequence may optionally be altered through changes at the DNA
level, particularly by mutating the DNA encoding the angiogenesis polypeptide at preselected
bases such that codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the
angiogenesis polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330 published 11
September 1987, and in Aplin and Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the angiogenesis polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin,
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol., 138:350 (1987).

Another type of covalent modification of angiogenesis polypeptides comprises linking the
angiogenesis polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.

Angiogenesis polypeptides of the present invention may also be modified in a
way to form chimeric molecules comprising an angiogenesis polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of an angiogenesis polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the angiogenesis polypeptide. The
presence of such epitope-tagged forms of an angiogenesis polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
angiogenesis polypeptide to be readily purified by affinity purification using an anti-tag
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative
embodiment, the chimeric molecule may comprise a fusion of an angiogenesis polypeptide
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known in the
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 [Field et
al., Mol. Cell. Biol., 8:2159-2165 (1988)]; the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto [Evan et al., Molecular and Cellular Biology, 5:3610-3616 (1985)];
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody [Paborsky et al.,
Protein Engineering, 3(6):547-553 (1990)]. Other tag polypeptides include the Flag-peptide
[Hopp et al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope peptide [Martin et al.,
Science, 255:192-194 (1992)]; tubulin epitope peptide [Skinner et al., J. Biol. Chem.,
266:15163-15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA, 87:6393-6397 (1990)].

Also included within an embodiment of angiogenesis protein are other
angiogenesis proteins of the angiogenesis family, and angiogenesis proteins from other
organisms, which are cloned and expressed as outlined below. Thus, probe or degenerate
polymerase chain reaction (PCR) primer sequences may be used to find other related
angiogenesis proteins from humans or other organisms. As will be appreciated by those in
the art, particularly useful probe and/or PCR primer sequences include the unique areas of the
angiogenesis nucleic acid sequence. As is generally known in the art, preferred PCR primers
are from about 15 to about 35 nucleotides in length, with from about 20 to about 30 being
preferred, and may contain inosine as needed. The conditions for the PCR reaction are well
known in the art (e.g., Innis, PCR Protocols, supra).
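
The primer length ranges just stated lend themselves to a simple check. The following
Python sketch is a hypothetical helper, not a primer design tool: it accepts the four
standard bases plus "I" for inosine, treats the "about" limits as hard cut-offs for
simplicity, and reports which of the two stated length ranges a candidate primer falls into.

```python
ALLOWED_BASES = set("ACGTI")   # "I" denotes inosine


def primer_length_class(primer):
    """Classify a primer by the length ranges given in the text.

    Returns "20-30" for the narrower preferred range, "15-35" for the broader
    range, or "outside" if the primer is shorter than 15 or longer than 35.
    """
    seq = primer.upper()
    invalid = set(seq) - ALLOWED_BASES
    if invalid:
        raise ValueError(f"unexpected characters in primer: {sorted(invalid)}")
    length = len(seq)
    if 20 <= length <= 30:
        return "20-30"
    if 15 <= length <= 35:
        return "15-35"
    return "outside"


if __name__ == "__main__":
    for candidate in ["ACGT" * 5, "ACGTI" * 3, "ACGT"]:   # made-up sequences
        print(len(candidate), primer_length_class(candidate))
```

Length is only one criterion; uniqueness of the primer within the angiogenesis nucleic acid
sequence, as noted above, must be assessed separately.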

In addition, as is outlined herein, angiogenesis proteins can be made that are
longer than those encoded by the nucleic acids of the figures, e.g., by the elucidation of
extended sequences, the addition of epitope or purification tags, the addition of other fusion
sequences, etc.

Angiogenesis proteins may also be identified as being encoded by 
angiogenesis nucleic acids. Thus, angiogenesis proteins are encoded by nucleic acids that 
will hybridize to the sequences of the sequence listings, or their complements, as outlined 
herein. 

In a preferred embodiment, when the angiogenesis protein is to be used to
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the angiogenesis protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller angiogenesis protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity. In a preferred
embodiment, the epitope is selected from a protein sequence set out in Table 8.

Methods of preparing polyclonal antibodies are known to the skilled artisan
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant.
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal
antibodies may be prepared using hybridoma methods, such as those described by Kohler and
Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal, is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-8, or fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene
glycol, to form a hybridoma cell [Goding, Monoclonal Antibodies: Principles and Practice,
Academic Press, (1986) pp. 59-103]. Immortalized cell lines are usually transformed
mammalian cells, particularly myeloma cells of rodent, bovine and human origin. Usually,
rat or mouse myeloma cell lines are employed. The hybridoma cells may be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells lack
the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture
medium for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine
("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-8 or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to angiogenesis protein are capable
of reducing or eliminating a biological function of an angiogenesis protein, as is described
below. That is, the addition of anti-angiogenesis protein antibodies (either polyclonal or
preferably monoclonal) to angiogenic tissue (or cells containing angiogenesis) may reduce or
eliminate the angiogenesis activity. Generally, at least a 25% decrease in activity is
preferred, with at least about 50% being particularly preferred and about a 95-100% decrease
being especially preferred.
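
The percentage decreases referred to above are simple ratios of activity measured with and
without antibody. A minimal Python sketch follows; the function names and example readings
are hypothetical, the choice of activity assay is left open, and the "about" limits are
treated as hard cut-offs purely for illustration.

```python
def percent_decrease(baseline, treated):
    """Percent decrease in activity relative to an untreated baseline measurement."""
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    return 100.0 * (baseline - treated) / baseline


def decrease_tier(decrease):
    """Map a percent decrease onto the preference tiers stated in the text."""
    if decrease >= 95:
        return "especially preferred"
    if decrease >= 50:
        return "particularly preferred"
    if decrease >= 25:
        return "preferred"
    return "below preferred range"


if __name__ == "__main__":
    baseline, treated = 200.0, 80.0        # made-up assay readings
    decrease = percent_decrease(baseline, treated)
    print(decrease, decrease_tier(decrease))
```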

In a preferred embodiment the antibodies to the angiogenesis proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the
imported CDR or framework sequences. In general, a humanized antibody will comprise
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin
and all or substantially all of the framework (FR) regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise
at least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature,
332:323-327 (1988); and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)].

Methods for humanizing non-human antibodies are well known in the art.
Generally, a humanized antibody has one or more amino acid residues introduced into it from
a source which is non-human. These non-human amino acid residues are often referred to as
import residues, which are typically taken from an import variable domain. Humanization
can be essentially performed following the method of Winter and co-workers [Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et
al., Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species. In practice, humanized antibodies are typically human antibodies in which
some CDR residues and possibly some FR residues are substituted by residues from
analogous sites in rodent antibodies.

Human antibodies can also be produced using various techniques known in the
art, including phage display libraries [Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of Cole et al. and
Boerner et al. are also available for the preparation of human monoclonal antibodies [Cole et
al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and Boerner et
al., J. Immunol., 147(1):86-95 (1991)]. Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in the following scientific publications:
Marks et al., Bio/Technology 10, 779-783 (1992); Lonberg et al., Nature 368, 856-859 (1994);
Morrison, Nature 368, 812-13 (1994); Fishwild et al., Nature Biotechnology 14, 845-51
(1996); Neuberger, Nature Biotechnology 14, 826 (1996); Lonberg and Huszar, Intern. Rev.
Immunol. 13, 65-93 (1995).

By immunotherapy is meant treatment of angiogenesis with an antibody raised
against angiogenesis proteins. As used herein, immunotherapy can be passive or active.
Passive immunotherapy as defined herein is the passive transfer of antibody to a recipient
(patient). Active immunization is the induction of antibody and/or T-cell responses in a
recipient (patient). Induction of an immune response is the result of providing the recipient
with an antigen to which antibodies are raised. As appreciated by one of ordinary skill in the
art, the antigen may be provided by injecting a polypeptide against which antibodies are
desired to be raised into a recipient, or contacting the recipient with a nucleic acid capable of
expressing the antigen and under conditions for expression of the antigen, leading to an
immune response.

In a preferred embodiment the angiogenesis proteins against which antibodies
are raised are secreted proteins as described above. Without being bound by theory,
antibodies used for treatment bind the secreted protein and prevent it from binding to its
receptor, thereby inactivating the secreted angiogenesis protein.

In another preferred embodiment, the angiogenesis protein to which antibodies
are raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment bind the extracellular domain of the angiogenesis protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane angiogenesis protein. As will be
appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
angiogenesis protein. The antibody is also an antagonist of the angiogenesis protein.
Further, the antibody prevents activation of the transmembrane angiogenesis protein. In one
aspect, when the antibody prevents the binding of other molecules to the angiogenesis
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein, thereby
mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, angiogenesis is
treated by administering to a patient antibodies directed against the transmembrane
angiogenesis protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated or fused to an
effector moiety. The effector moiety can be any number of molecules, including labelling
moieties such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In
one aspect the therapeutic moiety is a small molecule that modulates the activity of the
angiogenesis protein. In another aspect the therapeutic moiety modulates the activity of
molecules associated with or in close proximity to the angiogenesis protein. The therapeutic
moiety may inhibit enzymatic activity such as protease or collagenase activity associated with
angiogenesis, or be an attractant of other cells, such as NK cells.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic
agent. In this method, targeting the cytotoxic agent to angiogenesis tissue or cells results in a
reduction in the number of afflicted cells, thereby reducing symptoms associated with
angiogenesis. Cytotoxic agents are numerous and varied and include, but are not limited to,
cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against angiogenesis
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane angiogenesis proteins not
only serves to increase the local concentration of therapeutic moiety in the angiogenesis
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the therapeutic moiety.

In another preferred embodiment, the angiogenesis protein against which the
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated
or fused to a protein which facilitates entry into the cell. In one case, the antibody enters the
cell by endocytosis. In another embodiment, a nucleic acid encoding the antibody is
administered to the individual or cell. Moreover, wherein the angiogenesis protein can be
targeted within a cell, e.g., the nucleus, an antibody thereto contains a signal for that target
localization, e.g., a nuclear localization signal.

The angiogenesis antibodies of the invention specifically bind to angiogenesis
proteins. By "specifically bind" herein is meant that the antibodies bind to the protein with a
Kd of at least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1
µM or better, and most preferably, 0.01 µM or better. Selectivity of binding is also
important.
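
By way of illustration only, the affinity tiers recited above can be written as a simple lookup on the dissociation constant. The sketch assumes that "at least" and "or better" refer to a Kd at or below the stated value (a smaller Kd being tighter binding); the function name and tier labels are hypothetical and form no part of the disclosure.

```python
# Hypothetical helper: map a dissociation constant (Kd, in molar units) to the
# affinity tiers named in the text. Assumes smaller Kd = tighter ("better") binding.

KD_TIERS = [
    (1e-8, "most preferred (0.01 uM or better)"),
    (1e-7, "preferred (0.1 uM or better)"),
    (1e-6, "more usual (about 1 uM)"),
    (1e-4, "specific binding (about 0.1 mM)"),
]


def affinity_tier(kd_molar: float) -> str:
    """Return the tightest tier whose Kd ceiling the measured Kd satisfies."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    for ceiling, label in KD_TIERS:
        if kd_molar <= ceiling:
            return label
    return "not specific by this definition"


if __name__ == "__main__":
    for kd in (5e-9, 5e-8, 2e-5, 1e-3):
        print(f"Kd = {kd:.0e} M -> {affinity_tier(kd)}")
```

Selectivity is not captured by this lookup; it would be assessed separately, for example by comparing the Kd for the target against the Kd for unrelated proteins.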

In a preferred embodiment, the angiogenesis protein is purified or isolated
after expression. Angiogenesis proteins may be isolated or purified in a variety of ways
known to those skilled in the art depending on what other components are present in the
sample. Standard purification methods include electrophoretic, molecular, immunological
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the angiogenesis protein
may be purified using a standard anti-angiogenesis protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes, R., Protein Purification,
Springer-Verlag, NY (1982). The degree of purification necessary will vary depending on
the use of the angiogenesis protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the angiogenesis proteins and 
nucleic acids are useful in a number of applications. They may be used as immunoselection 
reagents, as vaccine reagents, as screening agents, etc.

Detection of angiogenesis sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the angiogenesis phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing angiogenesis) and in angiogenesis tissue (and in some cases, for varying
severities of angiogenesis that relate to prognosis, as outlined below) are evaluated to provide
expression profiles. An expression profile of a particular cell state or point of development is
essentially a "fingerprint" of the state. While two states may have any particular gene
similarly expressed, the evaluation of a number of genes simultaneously allows the
generation of a gene expression profile that is reflective of the state of the cell. By comparing
expression profiles of cells in different states, information regarding which genes are
important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or angiogenic tissue. This will provide
for molecular diagnosis of related conditions.
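
The comparison of a sample "fingerprint" against reference profiles may be sketched, purely for illustration, as follows. The sketch assumes that similarity is scored by Pearson correlation of log2 expression levels; the gene names, expression values, and function names are made-up placeholders, and any other similarity measure could be substituted.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def classify_profile(sample, references):
    """Return (state, correlation) for the reference profile most similar to the sample.

    `sample` maps gene -> expression level; `references` maps state -> such a dict.
    Only genes present in both the sample and the reference are compared.
    """
    best_state, best_r = None, -2.0
    for state, ref in references.items():
        genes = sorted(set(sample) & set(ref))
        r = pearson([math.log2(sample[g]) for g in genes],
                    [math.log2(ref[g]) for g in genes])
        if r > best_r:
            best_state, best_r = state, r
    return best_state, best_r


# Made-up expression levels for four placeholder genes (arbitrary units).
REFERENCES = {
    "normal":     {"geneA": 100, "geneB": 200, "geneC": 50,  "geneD": 400},
    "angiogenic": {"geneA": 800, "geneB": 210, "geneC": 400, "geneD": 100},
}
SAMPLE = {"geneA": 650, "geneB": 190, "geneC": 300, "geneD": 120}

if __name__ == "__main__":
    state, r = classify_profile(SAMPLE, REFERENCES)
    print(f"closest reference profile: {state} (r = {r:.2f})")
```

Because many genes are scored together, a single gene that is similarly expressed in both states (geneB in the placeholder data) does not prevent the two states from being distinguished.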

"Differential expression," or grammatical equivalents as used herein, refers to 
qualitative or quantitative differences in the temporal and/or cellular gene expression 
patterns within and among cells and tissue. Thus, a differentially expressed gene can 
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus angiogenic tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology, 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, Northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is at least
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.
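
One possible way of scoring a quantitative change against the percentages recited above is sketched below. The text does not specify how a percentage is to be computed for downregulation; this sketch assumes, as an illustration only, that the percentage is taken relative to the lower of the two levels, so that a two-fold change in either direction reads as 100%. All names are hypothetical.

```python
def percent_change(reference: float, sample: float):
    """Return (direction, percent) for a change in expression level.

    Assumption for this sketch: the percentage is taken relative to the lower
    of the two levels, so a two-fold change in either direction reads as 100%.
    """
    if reference <= 0 or sample <= 0:
        raise ValueError("expression levels must be positive")
    if sample == reference:
        return "unchanged", 0.0
    hi, lo = max(reference, sample), min(reference, sample)
    direction = "up" if sample > reference else "down"
    return direction, (hi / lo - 1.0) * 100.0


# Percent thresholds recited in the text, highest first.
PREFERENCE_STEPS = [300, 200, 150, 100, 50]


def highest_threshold_met(percent: float):
    """Return the largest recited threshold (in %) that the change reaches, or None."""
    for step in PREFERENCE_STEPS:
        if percent >= step:
            return step
    return None


if __name__ == "__main__":
    for ref, smp in [(100, 160), (100, 450), (400, 100), (100, 120)]:
        direction, pct = percent_change(ref, smp)
        print(ref, smp, direction, round(pct), highest_threshold_met(pct))
```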

Evaluation may be at the gene transcript or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the
final gene product itself (protein) can be monitored, e.g., with antibodies to the angiogenesis
protein and standard immunoassays (ELISAs, etc.) or other techniques, including mass
spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to
angiogenesis genes, i.e., those identified as being important in an angiogenesis phenotype,
can be evaluated in an angiogenesis diagnostic test.

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the angiogenesis nucleic acid probes are attached to 
biochips as outlined herein for the detection and quantification of angiogenesis sequences in a 
particular cell. The assays are further described below in the example. PCR techniques can
be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the angiogenesis protein are 
detected. Although DNA or RNA encoding the angiogenesis protein may be detected, of 
particular interest are methods wherein an mRNA encoding an angiogenesis protein is 
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be 
examined on a solid support such as nylon membranes and hybridizing the probe with the 
sample. Following washing to remove the non-specifically bound probe, the label is 






WO 02/079492 



PCT/US02/04915 



detected. In another method detection of the mRNA is performed in situ. In this method 
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid 
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following 
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding
an angiogenesis protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl
phosphate.

In a preferred embodiment, various proteins from the three classes of proteins
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The angiogenesis proteins, antibodies, nucleic acids, modified proteins and cells
containing angiogenesis sequences are used in diagnostic assays. This can be performed on
an individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques, to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, angiogenesis proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of angiogenesis. 
Detection of these proteins in putative angiogenesis tissue allows for detection or diagnosis of
angiogenesis. In one embodiment, antibodies are used to detect angiogenesis proteins. A
preferred method separates proteins from a sample by electrophoresis on a gel (typically a
denaturing and reducing protein gel, but may be another type of gel, including isoelectric
focusing gels and the like). Following separation of proteins, the angiogenesis protein is
detected, e.g., by immunoblotting with antibodies raised against the angiogenesis protein.
Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the angiogenesis protein find use in 
in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in 
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one 
to many antibodies to the angiogenesis protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label. In another method the primary antibody to the angiogenesis protein(s) contains a
detectable label, for example an enzyme marker that can act on a substrate. In another
preferred embodiment each one of multiple primary antibodies contains a distinct and
detectable label. This method finds particular use in simultaneous screening for a plurality of
angiogenesis proteins. As will be appreciated by one of ordinary skill in the art, many other
histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the 
ability to detect and distinguish emissions of different wavelengths. In addition, a
fluorescence activated cell sorter (FACS) can be used in the method. 

In another preferred embodiment, antibodies find use in diagnosing 
angiogenesis from biological samples, such as blood, urine, sputum, or other bodily fluids. 
As previously described, certain angiogenesis proteins are secreted/circulating molecules.
Blood samples, therefore, are useful as samples to be probed or tested for the presence of
secreted angiogenesis proteins. Antibodies can be used to detect an angiogenesis protein by 
previously described immunoassay techniques including ELISA, immunoblotting (Western 
blotting), immunoprecipitation, BIACORE technology and the like. Conversely, the presence 
of antibodies may indicate an immune response against an endogenous angiogenesis protein. 

In a preferred embodiment, in situ hybridization of labeled angiogenesis
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including
angiogenesis tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel,
supra) is then performed. When comparing the fingerprints between an individual and a
standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the
findings. It is further understood that the genes which indicate the diagnosis may differ from
those which indicate the prognosis, and molecular profiling of the condition of the cells may
lead to distinctions between responsive or refractory conditions or may be predictive of
outcomes.

In a preferred embodiment, the angiogenesis proteins, antibodies, nucleic
acids, modified proteins and cells containing angiogenesis sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to angiogenesis
severity, in terms of long term prognosis. Again, this may be done on either a protein or gene
level, with the use of genes being preferred. As above, angiogenesis probes may be attached
to biochips for the detection and quantification of angiogenesis sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

In a preferred embodiment members of the three classes of proteins as 
described herein are used in drug screening assays. The angiogenesis proteins, antibodies, 
nucleic acids, modified proteins and cells containing angiogenesis sequences are used in drug
screening assays or by evaluating the effect of drug candidates on a "gene expression profile"
or expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al., Science 279, 84-8 (1998); Heid, Genome Res 6:986-94, 1996).

In a preferred embodiment, the angiogenesis proteins, antibodies, nucleic
acids, modified proteins and cells containing the native or modified angiogenesis proteins are
used in screening assays. That is, the present invention provides novel methods for screening
for compositions which modulate the angiogenesis phenotype or an identified physiological
function of an angiogenesis protein. As above, this can be done on an individual gene level
or by evaluating the effect of drug candidates on a "gene expression profile". In a preferred 
embodiment, the expression profiles are used, preferably in conjunction with high throughput 
screening techniques to allow monitoring for expression profile genes after treatment with a 
candidate agent, see Zlokarnik, supra. 

Having identified the differentially expressed genes herein, a variety of assays
may be executed. In a preferred embodiment, assays may be run on an individual gene or
protein level. That is, having identified a particular gene as upregulated in angiogenesis, test
compounds can be screened for the ability to modulate gene expression or for binding to the
angiogenic protein. "Modulation" thus includes both an increase and a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing angiogenesis, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in angiogenic tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in
angiogenic tissue compared to normal tissue often provides a target value of a 10-fold
increase in expression to be induced by the test compound.
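
The two worked figures above follow from taking the reciprocal of the observed fold change. A minimal sketch, with hypothetical function names and assuming the goal is simply to return expression to the level seen in normal tissue:

```python
def target_fold_change(observed_fold: float) -> float:
    """Fold change a test compound would need to induce to restore the normal level.

    `observed_fold` is (angiogenic level / normal level): 4.0 means a 4-fold
    increase, 0.1 means a 10-fold decrease. The target is its reciprocal.
    """
    if observed_fold <= 0:
        raise ValueError("fold change must be positive")
    return 1.0 / observed_fold


def describe(fold: float) -> str:
    """Render a fold change as 'N-fold increase' or 'N-fold decrease'."""
    if fold >= 1:
        return f"{fold:g}-fold increase"
    return f"{1 / fold:g}-fold decrease"


if __name__ == "__main__":
    for observed in (4.0, 0.1):
        print(describe(observed), "->", describe(target_fold_change(observed)))
```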

The amount of gene expression may be monitored using nucleic acid probes 
and the quantification of gene expression levels, or, alternatively, the gene product itself can 
be monitored, e.g., through the use of antibodies to the angiogenesis protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression. 

In a preferred embodiment, the gene or protein expression of a number
of entities, i.e., an expression profile, is monitored simultaneously. Such profiles will
typically involve a plurality of those entities described herein.

In this embodiment, the angiogenesis nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of angiogenesis sequences in a
particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may
be used with dispensed primers in desired wells. A PCR reaction can then be performed and
analyzed for each well.
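
As a purely illustrative sketch of dispensing primers into desired wells, the following assigns primer sets to consecutive wells of a microtiter plate. The 96-well (8 x 12) format, the use of three replicate wells per primer set, and all names are assumptions made for the example.

```python
from itertools import product
import string


def plate_wells(rows: int = 8, cols: int = 12):
    """Well names of a microtiter plate in row-major order (A1 ... H12 for 96 wells)."""
    return [f"{string.ascii_uppercase[r]}{c + 1}"
            for r, c in product(range(rows), range(cols))]


def assign_primers(primer_sets, replicates: int = 3, rows: int = 8, cols: int = 12):
    """Map well name -> primer-set name, using `replicates` consecutive wells per set."""
    wells = plate_wells(rows, cols)
    needed = len(primer_sets) * replicates
    if needed > len(wells):
        raise ValueError(f"{needed} wells needed but the plate has {len(wells)}")
    layout = {}
    for i, name in enumerate(primer_sets):
        for j in range(replicates):
            layout[wells[i * replicates + j]] = name
    return layout


if __name__ == "__main__":
    # Placeholder primer-set names; one PCR reaction is run and read per well.
    for well, primers in assign_primers(["geneA", "geneB", "control"]).items():
        print(well, primers)
```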

Modulators of angiogenesis 

Expression monitoring can be performed to identify compounds that modify 
the expression of one or more angiogenesis-associated sequences, e.g., a polynucleotide 
sequence set out in Tables 1-8. Generally, in a preferred embodiment, a test modulator is
added to the cells prior to analysis. Moreover, screens are also provided to identify agents 
that modulate angiogenesis, modulate angiogenesis proteins, bind to an angiogenesis protein, 
or interfere with the binding of an angiogenesis protein and an antibody or other binding 
partner. 

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the angiogenesis phenotype or the expression of an angiogenesis sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses an angiogenesis phenotype, for example to a normal
tissue fingerprint. In another embodiment, a modulator induces an angiogenesis phenotype.
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations
to obtain a differential response to the various concentrations. Typically, one of these
concentrations serves as a negative control, i.e., at zero concentration or below the level of
detection.
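
A concentration series of the kind described above, with a zero-concentration negative control, might be laid out as in the following sketch. The serial-dilution scheme, the normalization of readouts to the control, and all names and numbers are illustrative assumptions rather than a prescribed protocol.

```python
def dilution_series(top_conc: float, dilution_factor: float, n_points: int):
    """Agent concentrations for parallel assay mixtures.

    Returns `n_points` serially diluted concentrations followed by 0.0,
    the zero-concentration negative control.
    """
    if top_conc <= 0 or dilution_factor <= 1 or n_points < 1:
        raise ValueError("need top_conc > 0, dilution_factor > 1, n_points >= 1")
    series = [top_conc / dilution_factor ** i for i in range(n_points)]
    return series + [0.0]


def percent_of_control(readouts):
    """Express each readout as a percentage of the last (negative control) readout."""
    control = readouts[-1]
    if control == 0:
        raise ValueError("control readout must be non-zero")
    return [100.0 * r / control for r in readouts]


if __name__ == "__main__":
    concs = dilution_series(10.0, 10, 3)      # e.g. 10, 1, 0.1 uM plus control
    readouts = [20.0, 50.0, 90.0, 100.0]      # made-up assay signals, same order
    for c, pct in zip(concs, percent_of_control(readouts)):
        print(f"{c:g} uM -> {pct:.0f}% of control")
```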

In one aspect, a modulator will neutralize the effect of an angiogenesis protein.
By "neutralize" is meant that activity of a protein is inhibited or blocked such that the
protein has substantially no effect on a cell.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to an angiogenesis polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve
providing a library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical
compounds generated by either chemical synthesis or biological synthesis by combining a
number of chemical "building blocks" such as reagents. For example, a linear combinatorial
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of
chemical building blocks called amino acids in every possible way for a given compound
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al. (1994) J. Med. Chem. 37(9): 1233-1251).
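
The size of such a linear library can be illustrated with a short sketch. It assumes the building blocks are the 20 standard amino acids, so that a library of length n contains 20^n sequences; the function names are hypothetical.

```python
from itertools import islice, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def library_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct linear sequences of the given length."""
    return len(alphabet) ** length


def enumerate_library(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every sequence of the given length, one building block per position."""
    for combo in product(alphabet, repeat=length):
        yield "".join(combo)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(f"length {n}: {library_size(n):,} peptides")
    print(list(islice(enumerate_library(3), 5)))
```

On these assumptions a pentapeptide library already holds 3,200,000 members, consistent with the "millions" of compounds referred to above.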

Preparation and screening of combinatorial chemical libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Int. J. Pept. Prot. Res.,
37: 487-493, Houghton et al. (1991) Nature, 354: 84-88), peptoids (PCT Publication No. WO
91/19735, 26 Dec. 1991), encoded peptides (PCT Publication WO 93/20242, 14 Oct. 1993),
random bio-oligomers (PCT Publication WO 92/00091, 9 Jan. 1992), benzodiazepines (U.S.
Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs
et al., (1993) Proc. Nat. Acad. Sci. USA 90: 6909-6913), vinylogous polypeptides (Hagihara
et al. (1992) J. Amer. Chem. Soc. 114: 6568), nonpeptidal peptidomimetics with a
Beta-D-Glucose scaffolding (Hirschmann et al., (1992) J. Amer. Chem. Soc. 114: 9217-9218),
analogous organic syntheses of small compound libraries (Chen et al. (1994) J. Amer. Chem.
Soc. 116: 2661), oligocarbamates (Cho, et al., (1993) Science 261:1303), and/or peptidyl
phosphonates (Campbell et al., (1994) J. Org. Chem. 59: 658). See, generally, Gordon et al.,
(1994) J. Med. Chem. 37:1385, nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide
nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn
et al. (1996) Nature Biotechnology, 14(3): 309-314, and PCT/US96/10287), carbohydrate
libraries (see, e.g., Liang et al., (1996) Science, 274: 1520-1522, and U.S. Patent No.
5,593,853), and small organic molecule libraries (see, e.g., benzodiazepines, Baum (1993)
C&EN, Jan 18, page 33; isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and
metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and
5,519,134; morpholino compounds, U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent
No. 5,288,514; and the like).

Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony,
Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore,
Bedford, MA).

A number of well known robotic systems have also been developed for
solution phase chemistries. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual
synthetic operations performed by a chemist. Any of the above devices are suitable for use
with the present invention. The nature and implementation of modifications to these devices
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the
relevant art. In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect enhancement or inhibition of angiogenesis gene transcription, 
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
for example, U.S. Patent No. 5,559,410 discloses high throughput screening methods for
proteins, U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic
acid binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, for example, Zymark Corp.
provides technical bulletins describing screening systems for detecting the modulation of
gene transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In
this way libraries of proteins may be made for screening in the methods of the invention.
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and
mammalian proteins, with the latter being preferred, and human proteins being especially
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7
to about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may incorporate any nucleotide or amino acid at any position. The synthetic process can be
designed to generate randomized proteins or nucleic acids, to allow the formation of all or
most of the possible combinations over the length of the sequence, thus forming a library of
randomized candidate bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, for example, of hydrophobic
amino acids, hydrophilic residues, or sterically biased (either small or large) residues, or are
biased towards the creation of nucleic acid binding domains, the creation of cysteines for
cross-linking, prolines for SH-3 domains, serines, threonines, tyrosines or histidines for
phosphorylation sites, etc., or to purines, etc.
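
By way of illustration only, the difference between a fully randomized and a biased
peptide library can be sketched in Python. The residue groupings, the template notation
(lowercase letters for randomized positions, uppercase letters for constant residues), and the
function names are assumptions made for this sketch and are not part of the disclosure.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

# Illustrative residue classes (assumed groupings, not taken from the text).
CLASSES = {
    "x": AMINO_ACIDS,  # fully randomized position
    "h": "AVLIMFW",    # position biased to hydrophobic residues
    "p": "STYH",       # position biased to potential phosphorylation sites
}

def random_peptide(template, rng):
    """Build one peptide: lowercase class codes are randomized within their
    class, any other character (e.g. 'C' or 'P') is held constant."""
    return "".join(rng.choice(CLASSES[c]) if c in CLASSES else c for c in template)

def make_library(template, size, seed=0):
    """Generate `size` peptides from one template (duplicates are possible)."""
    rng = random.Random(seed)
    return [random_peptide(template, rng) for _ in range(size)]

def library_complexity(template):
    """Number of distinct sequences the template can encode."""
    total = 1
    for c in template:
        total *= len(CLASSES[c]) if c in CLASSES else 1
    return total

# A 10-mer with a constant cysteine and proline and two hydrophobic positions.
biased = make_library("xxCxxhhPxx", size=5)
fully_random = make_library("x" * 10, size=5)
```

Here `library_complexity("x" * 10)` is 20 to the tenth power, whereas the biased template
encodes far fewer sequences, which is the practical effect of holding positions constant or
restricting them to a class.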

Modulators of angiogenesis can also be nucleic acids, as defined above. 

As described above generally for proteins, nucleic acid modulating agents may
be naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
For example, digests of procaryotic or eucaryotic genomes may be used as is outlined above
for proteins.

In a preferred embodiment, the candidate compounds are organic chemical 
moieties, a wide variety of which are available in the literature. 

After the candidate agent has been added and the cells allowed to incubate for
some period of time, the sample containing a target sequence to be analyzed is added to the
biochip. If required, the target sequence is prepared using known techniques. For example,
the sample may be treated to lyse the cells, using known lysis buffers, electroporation, etc.,
with purification and/or amplification such as PCR performed as appropriate. For example,
an in vitro transcription with labels covalently attached to the nucleotides is performed.
Generally, the nucleic acids are labeled with biotin-FITC or PE, or with Cy3 or Cy5.

In a preferred embodiment, the target sequence is labeled with, for example, a
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of
detecting the target sequence's specific binding to a probe. The label also can be an enzyme,
such as alkaline phosphatase or horseradish peroxidase, which when provided with an
appropriate substrate produces a product that can be detected. Alternatively, the label can be
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention,
including high, moderate and low stringency conditions as outlined above. The assays are
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.
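
By way of illustration only, the effect of several of these parameters on duplex
stability can be sketched with a commonly cited empirical approximation of the melting
temperature of long DNA duplexes. The specific formula and its coefficients are assumptions
brought in for this sketch, not values taken from the present disclosure, and the
approximation is not suited to short oligonucleotide probes.

```python
import math

def approx_tm_celsius(na_molar, gc_percent, formamide_percent, length_nt):
    """Approximate melting temperature (deg C) of a long DNA:DNA duplex.

    Assumed empirical form (coefficients are not from the source text):
        Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)

# Lower salt or added formamide lowers Tm, so a fixed wash temperature
# becomes more stringent under those conditions.
tm_high_salt = approx_tm_celsius(0.9, 50.0, 0.0, 500)
tm_low_salt = approx_tm_celsius(0.015, 50.0, 0.0, 500)
tm_formamide = approx_tm_celsius(0.9, 50.0, 50.0, 500)
```

Under this approximation, a wash performed closer to the estimated melting temperature
is a higher stringency wash, which is one way the parameters listed above can be traded
against one another.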

These parameters may also be used to control non-specific binding, as is
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain
steps at higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders,
with preferred embodiments outlined below. In addition, the reaction may include a variety
of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.

Screens are performed to identify modulators of the angiogenesis phenotype.
In one embodiment, screening is performed to identify modulators that can induce or
suppress a particular expression profile, thus preferably generating the associated phenotype.
In another embodiment, e.g., for diagnostic applications, having identified differentially
expressed genes important in a particular state, screens can be performed to identify
modulators that alter expression of individual genes. In another embodiment, screening is
performed to identify modulators that alter a biological function of the expression product of
a differentially expressed gene. Again, having identified the importance of a gene in a
particular state, screens are performed to identify agents that bind and/or modulate the
biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a
candidate agent. After identifying a modulator based upon its ability to suppress an
angiogenesis expression pattern leading to a normal expression pattern, or to modulate a
single angiogenesis gene expression profile so as to mimic the expression of the gene from
normal tissue, a screen as described above can be performed to identify genes that are
specifically modulated in response to the agent. Comparing expression profiles between
normal tissue and agent treated angiogenesis tissue reveals genes that are not expressed in
normal tissue or angiogenesis tissue, but are expressed in agent treated tissue. These
agent-specific sequences can be identified and used by methods described herein for angiogenesis
genes or proteins. In particular these sequences and the proteins they encode find use in
marking or identifying agent treated cells. In addition, antibodies can be raised against the
agent induced proteins and used to target novel therapeutics to the treated angiogenesis tissue
sample.

Thus, in one embodiment, a test compound is administered to a population of
angiogenic cells that have an associated angiogenesis expression profile. By
"administration" or "contacting" herein is meant that the candidate agent is added to the cells
in such a manner as to allow the agent to act upon the cell, whether by uptake and
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct
such as an adenoviral or retroviral construct, and added to the cell, such that expression of
the peptide agent is accomplished (see, e.g., PCT US97/01019). Regulatable gene therapy
systems can also be used.

Once the test compound has been administered to the cells, the cells can be
washed if desired and are allowed to incubate under preferably physiological conditions for
some period of time. The cells are then harvested and a new gene expression profile is
generated, as outlined herein.

Thus, for example, angiogenesis tissue may be screened for agents that
modulate, e.g., induce or suppress, the angiogenesis phenotype. A change in at least one
gene, preferably many, of the expression profile indicates that the agent has an effect on
angiogenesis activity. By defining such a signature for the angiogenesis phenotype, screens
for new drugs that alter the phenotype can be devised. With this approach, the drug target
need not be known and need not be represented in the original expression screening platform,
nor does the level of transcript for the target protein need to change.
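
By way of illustration only, the comparison of an expression profile taken before and
after administration of a test compound can be sketched as follows. The gene names, the
signal values, the two-fold threshold, and the signal floor are hypothetical values chosen
for this sketch and are not taken from the disclosure.

```python
import math

def changed_genes(before, after, min_log2_fold=1.0, floor=1.0):
    """Return {gene: log2 fold change} for genes whose expression level
    differs between two profiles by at least `min_log2_fold`.

    `before` and `after` map gene names to expression signals. Signals are
    clipped to `floor` so that very low readings do not produce huge ratios.
    """
    changes = {}
    for gene in before.keys() & after.keys():
        ratio = max(after[gene], floor) / max(before[gene], floor)
        log2_fold = math.log2(ratio)
        if abs(log2_fold) >= min_log2_fold:
            changes[gene] = log2_fold
    return changes

# Hypothetical profiles for three genes, untreated versus agent treated.
untreated = {"gene_A": 100.0, "gene_B": 200.0, "gene_C": 50.0}
treated = {"gene_A": 400.0, "gene_B": 210.0, "gene_C": 10.0}

signature_shift = changed_genes(untreated, treated)
agent_has_effect = len(signature_shift) >= 1  # "a change in at least one gene"
```

In this sketch, any gene crossing the threshold counts toward the change in the profile;
a stricter screen could require many genes to change, as the text prefers.
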

Measurement of angiogenesis polypeptide activity, or of angiogenesis or the
angiogenic phenotype, can be performed using a variety of assays. For example, the effects of
the test compounds upon the function of the angiogenesis polypeptides can be measured by
examining parameters described above. A suitable physiological change that affects activity
can be used to assess the influence of a test compound on the polypeptides of this invention.
When the functional consequences are determined using intact cells or animals, one can also
measure a variety of effects such as, in the case of angiogenesis associated with tumors,
tumor growth, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian angiogenesis polypeptide is typically used, e.g.,
mouse, preferably human.

A variety of angiogenesis assays are known to those of skill in the art. Various
models have been employed to evaluate angiogenesis (e.g., St. Croix et al., Science
289:1197-1202, 2000 and Kahn et al., Amer. J. Pathol. 156:1887-1900). Assessment of angiogenesis
in the presence of a potential modulator of angiogenesis can be performed using cell-culture-based
angiogenesis assays, e.g., endothelial cell tube formation assays, as well as other
bioassays such as the chick CAM assay, the mouse corneal assay, and assays measuring the
effect of administering potential modulators on implanted tumors. The chick CAM assay is
described by O'Reilly, et al. Cell 79: 315-328, 1994. Briefly, 3 day old chicken embryos with
intact yolks are separated from the egg and placed in a petri dish. After 3 days of incubation,
a methylcellulose disc containing the protein to be tested is applied to the CAM of individual
embryos. After about 48 hours of incubation, the embryos and CAMs are observed to
determine whether endothelial growth has been inhibited. The mouse corneal assay involves
implanting a growth factor-containing pellet, along with another pellet containing the
suspected endothelial growth inhibitor, in the cornea of a mouse and observing the pattern of
capillaries that are elaborated in the cornea. Angiogenesis can also be measured by
determining the extent of neovascularization of a tumor. For example, carcinoma cells can be
subcutaneously inoculated into athymic nude mice and tumor growth then monitored. The
cancer cells are treated with an angiogenesis inhibitor, such as an antibody or other
compound that is exogenously administered, or can be transfected prior to inoculation with a
polynucleotide inhibitor of angiogenesis. Immunoassays using endothelial cell-specific
antibodies are typically used to stain for vascularization of the tumor and to determine the
number of vessels in the tumor.

Assays to identify compounds with modulating activity can be performed in
vitro. For example, an angiogenesis polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the angiogenesis polypeptide levels are determined in vitro by measuring the level of protein
or mRNA. The level of protein is measured using immunoassays such as western blotting,
ELISA and the like with an antibody that selectively binds to the angiogenesis polypeptide or
a fragment thereof. For measurement of mRNA, amplification, e.g., using PCR, LCR, or
hybridization assays, e.g., northern hybridization, RNAse protection, dot blotting, are
preferred. The level of protein or mRNA is detected using directly or indirectly labeled
detection agents, e.g., fluorescently or radioactively labeled nucleic acids, radioactively or
enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the angiogenesis
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "angiogenesis proteins". In
preferred embodiments the angiogenesis protein comprises a sequence shown in Table 8.
The angiogenesis protein may be a fragment, or alternatively, may be the full length protein
corresponding to a fragment shown herein.

Preferably, the angiogenesis protein is a fragment approximately 14 to 24
amino acids long. More preferably the fragment is a soluble fragment. In one embodiment
an angiogenesis protein is conjugated or fused to an immunogenic agent or BSA.

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure-activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the angiogenesis proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining an
angiogenesis protein and a candidate compound, and determining the binding of the
compound to the angiogenesis protein. Preferred embodiments utilize the human
angiogenesis protein, although other mammalian proteins may also be used, for example for
the development of animal models of human disease. In some embodiments, as outlined
herein, variant or derivative angiogenesis proteins may be used.

Generally, in a preferred embodiment of the methods herein, the angiogenesis
protein or the candidate agent is non-diffusably bound to an insoluble support having isolated
sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble support may be
made of any composition to which the compositions can be bound, that is readily separated
from soluble material, and that is otherwise compatible with the overall method of screening.
The surface of such supports may be solid or porous and of any convenient shape. Examples of
suitable insoluble supports include microtiter plates, arrays, membranes and beads. These are
typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon or nitrocellulose,
teflon™, etc. Microtiter plates and arrays are especially convenient because a large number
of assays can be carried out simultaneously, using small amounts of reagents and samples.
The particular manner of binding of the composition is not crucial so long as it is compatible
with the reagents and overall methods of the invention, maintains the activity of the
composition and is nondiffusable. Preferred methods of binding include the use of antibodies
(which do not sterically block either the ligand binding site or activation sequence when the
protein is bound to the support), direct binding to "sticky" or ionic supports, chemical
crosslinking, the synthesis of the protein or agent on the surface, etc. Following binding of
the protein or agent, excess unbound material is removed by washing. The sample receiving
areas may then be blocked through incubation with bovine serum albumin (BSA), casein or
other innocuous protein or other moiety.

In a preferred embodiment, the angiogenesis protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the angiogenesis protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
angiogenesis protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the angiogenesis protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

By "labeled" herein is meant that the compound is either directly or indirectly
labeled with a label which provides a detectable signal, e.g., radioisotopes, fluorescers,
enzymes, antibodies, particles such as magnetic particles, chemiluminescers, or specific
binding molecules, etc. Specific binding molecules include pairs, such as biotin and
streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the
complementary member would normally be labeled with a molecule which provides for
detection, in accordance with known procedures, as outlined above. The label can directly or
indirectly provide a detectable signal.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., 125I for the proteins and a fluorophore
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., an angiogenesis protein), such as an antibody, peptide, binding partner, ligand,
etc. Under certain circumstances, there may be competitive binding between the compound
and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the angiogenesis protein and thus is capable of binding to, and potentially modulating, the
activity of the angiogenesis protein. In this embodiment, either component can be labeled.
Thus, for example, if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the angiogenesis protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the angiogenesis protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the angiogenesis proteins. In
this embodiment, the methods comprise combining an angiogenesis protein and a competitor
in a first sample. A second sample comprises a test compound, an angiogenesis protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the angiogenesis protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the angiogenesis protein.
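
By way of illustration only, the two-sample comparison described above can be sketched
as follows. The signal values, the 50 percent cutoff, and the function names are hypothetical
and are chosen only to make the comparison concrete.

```python
def percent_displacement(competitor_signal_first, competitor_signal_second):
    """Percent reduction in bound, labeled competitor in the second sample
    (protein + competitor + test compound) relative to the first sample
    (protein + competitor only)."""
    if competitor_signal_first <= 0:
        raise ValueError("first-sample signal must be positive")
    return 100.0 * (competitor_signal_first - competitor_signal_second) / competitor_signal_first

def is_candidate_binder(competitor_signal_first, competitor_signal_second, cutoff_percent=50.0):
    """Flag a test compound when competitor binding differs by at least the
    cutoff (an assumed threshold, to be set per assay)."""
    change = percent_displacement(competitor_signal_first, competitor_signal_second)
    return abs(change) >= cutoff_percent

# Hypothetical counts of bound labeled competitor in the two samples.
displaced = percent_displacement(1000.0, 250.0)
flagged = is_candidate_binder(1000.0, 250.0)
```

The absolute value is used because the text treats any difference in competitor binding
between the two samples as indicative, not only a decrease.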

Alternatively, differential screening is used to identify drug candidates that
bind to the native angiogenesis protein, but cannot bind to modified angiogenesis proteins.
The structure of the angiogenesis protein may be modeled, and used in rational drug design to
synthesize agents that interact with that site. Drug candidates that affect the activity of an
angiogenesis protein are also identified by screening drugs for the ability to either enhance or
reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are run in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled agent determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.
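
By way of illustration only, triplicate control and test readings can be compared with a
two-sample test statistic. The sketch below computes Welch's t statistic and its approximate
degrees of freedom; the choice of this particular test and the scintillation counts shown are
assumptions made for the sketch, not a method specified by the disclosure.

```python
import math
import statistics

def welch_t(control, test):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for
    two small sets of replicate readings (e.g. triplicates)."""
    n1, n2 = len(control), len(test)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two replicates")
    m1, m2 = statistics.mean(control), statistics.mean(test)
    # Sample variance of each group divided by its size.
    s1, s2 = statistics.variance(control) / n1, statistics.variance(test) / n2
    t = (m2 - m1) / math.sqrt(s1 + s2)
    df = (s1 + s2) ** 2 / (s1 ** 2 / (n1 - 1) + s2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical counts per minute of bound label, run in triplicate.
control_cpm = [100.0, 110.0, 90.0]
test_cpm = [50.0, 55.0, 45.0]
t_stat, dof = welch_t(control_cpm, test_cpm)
```

The resulting statistic would then be compared against a t distribution with the computed
degrees of freedom; with only three replicates per group the test has limited power, which
is why the text calls for at least triplicate samples.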

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g., albumin, detergents, etc., which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of an angiogenesis protein. The methods
comprise adding a test compound, as defined above, to a cell comprising angiogenesis
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes an angiogenesis protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous
or subsequent exposure to, physiological signals, for example hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogens, or other cells (i.e., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate angiogenesis agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the angiogenesis protein. Once identified, similar structures are evaluated to identify critical
structural features of the compound.

In one embodiment, a method of inhibiting angiogenic cell division is
provided. The method comprises administration of an angiogenesis inhibitor. In another
embodiment, a method of inhibiting angiogenesis is provided. The method comprises
administration of an angiogenesis inhibitor. In a further embodiment, methods of treating
cells or individuals with angiogenesis are provided. The method comprises administration of
an angiogenesis inhibitor.

In one embodiment, an angiogenesis inhibitor is an antibody as discussed 
above. In another embodiment, the angiogenesis inhibitor is an antisense molecule. 

Polynucleotide modulators of angiogenesis
Antisense Polynucleotides

In certain embodiments, the activity of an angiogenesis-associated protein is
downregulated, or entirely inhibited, by the use of antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., an angiogenesis protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring
subunits or their close homologs. Antisense polynucleotides may also have altered sugar
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other
sulfur containing species which are known for use in the art. Analogs are comprehended by
this invention so long as they function effectively to hybridize with the angiogenesis protein
mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

Such antisense polynucleotides can readily be synthesized using recombinant
means, or can be synthesized in vitro. Equipment for such synthesis is sold by several
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as
phosphorothioates and alkylated derivatives is also well known to those of skill in the art.

Antisense molecules as used herein include antisense or sense
oligonucleotides. Sense oligonucleotides can, e.g., be employed to block transcription by
binding to the antisense strand. The antisense and sense oligonucleotides comprise a
single-stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA
(sense) or DNA (antisense) sequences for angiogenesis molecules. A preferred antisense
molecule is for an angiogenesis sequence in Tables 1-8, or for a ligand or activator thereof.
Antisense or sense oligonucleotides, according to the present invention, comprise a fragment
generally at least about 14 nucleotides, preferably from about 14 to 30 nucleotides. The
ability to derive an antisense or a sense oligonucleotide, based upon a cDNA sequence
encoding a given protein is described in, for example, Stein and Cohen (Cancer Res. 48:2659,
1988) and van der Krol et al. (BioTechniques 6:958, 1988).
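
By way of illustration only, deriving candidate antisense oligonucleotides from a cDNA sequence can be sketched as taking the reverse complement of successive windows of about 14 to 30 nucleotides. The function names and the placeholder sequence below are hypothetical and are not taken from Tables 1-8; the sketch does not rank candidates by specificity, secondary structure, or any other selection criterion.

```python
# Illustrative sketch only: enumerate candidate antisense oligonucleotides as
# reverse complements of fixed-length windows of a cDNA (coding-strand) sequence.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_candidates(cdna: str, length: int = 20):
    """Yield (start, oligo) pairs, where oligo is complementary to
    cdna[start:start + length]."""
    if not 14 <= length <= 30:
        raise ValueError("length is outside the about 14 to 30 nucleotide range")
    for start in range(len(cdna) - length + 1):
        yield start, reverse_complement(cdna[start:start + length])


# Hypothetical 24-nucleotide placeholder sequence (an assumption for the example).
example_cdna = "ATGGCTAGCTTGACCGATCGTACG"
candidates = list(antisense_candidates(example_cdna, length=14))
```

Each oligonucleotide returned this way is complementary to the corresponding stretch of the sense strand, so taking its reverse complement again recovers the original window.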

Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and 
inhibit transcription of angiogenesis-associated nucleotide sequences. A ribozyme is an RNA 
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto et al. (1994) Adv. in Pharmacology 25:
289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel et al.
(1990) Nucl. Acids Res. 18: 299-304; Hampel et al. (1990) European Patent Publication No. 0
360 257; U.S. Patent No. 5,254,678. Methods of preparing ribozymes are well known to those of
skill in the art (see, e.g., Wong-Staal et al., WO 94/26877; Ojwang et al. (1993) Proc. Natl. Acad.
Sci. USA 90: 6340-6344; Yamada et al. (1994) Human Gene Therapy 1: 39-45; Leavitt et al.
(1995) Proc. Natl. Acad. Sci. USA 92: 699-703; Leavitt et al. (1994) Human Gene Therapy 5:
1151-120; and Yamada et al. (1994) Virology 205: 121-126).

Polynucleotide modulators of angiogenesis may be introduced into a cell
containing the target nucleotide sequence by formation of a conjugate with a ligand binding
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does
not substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of
angiogenesis may be introduced into a cell containing the target nucleic acid sequence, e.g.,
by formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating angiogenesis in cells or
organisms are provided. In one embodiment, the methods comprise administering to a cell an
anti-angiogenesis antibody that reduces or eliminates the biological activity of an
endogenous angiogenesis protein. Alternatively, the methods comprise administering to a
cell or organism a recombinant nucleic acid encoding an angiogenesis protein. This may be
accomplished in any number of ways. In a preferred embodiment, for example when the
angiogenesis sequence is down-regulated in angiogenesis, such state may be reversed by
increasing the amount of angiogenesis gene product in the cell. This can be accomplished,
e.g., by overexpressing the endogenous angiogenesis gene or administering a gene encoding
the angiogenesis sequence, using known gene-therapy techniques, for example. In a
preferred embodiment, the gene therapy techniques include the incorporation of the
exogenous gene using enhanced homologous recombination (EHR), for example as described
in PCT/US93/03868, hereby incorporated by reference in its entirety. Alternatively, for
example when the angiogenesis sequence is up-regulated in angiogenesis, the activity of the
endogenous angiogenesis gene is decreased, for example by the administration of an
angiogenesis antisense nucleic acid or other inhibitor, such as RNAi.

In one embodiment, the angiogenesis proteins of the present invention may
be used to generate polyclonal and monoclonal antibodies to angiogenesis proteins.
Similarly, the angiogenesis proteins can be coupled, using standard technology, to affinity
chromatography columns. These columns may then be used to purify angiogenesis
antibodies useful for production, diagnostic, or therapeutic purposes. In a preferred
embodiment, the antibodies are generated to epitopes unique to an angiogenesis protein; that
is, the antibodies show little or no cross-reactivity to other proteins. The angiogenesis
antibodies may be coupled to standard affinity chromatography columns and used to purify
angiogenesis proteins. The antibodies may also be used as blocking polypeptides, as outlined
above, since they will specifically bind to the angiogenesis protein.



Methods of identifying variant angiogenesis-associated sequences 

Without being bound by theory, expression of various angiogenesis sequences
is correlated with angiogenesis. Accordingly, disorders based on mutant or variant
angiogenesis genes may be determined. In one embodiment, the invention provides methods
for identifying cells containing variant angiogenesis genes, e.g., determining all or part of the
sequence of at least one endogenous angiogenesis gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the angiogenesis genotype of an individual, e.g.,
determining all or part of the sequence of at least one angiogenesis gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced angiogenesis gene to a known angiogenesis gene,
i.e., a wild-type gene.

The sequence of all or part of the angiogenesis gene can then be compared to
the sequence of a known angiogenesis gene to determine if any differences exist. This can be
done using any number of known homology programs, such as Bestfit, etc. In a preferred
embodiment, the presence of a difference in the sequence between the angiogenesis gene of
the patient and the known angiogenesis gene correlates with a disease state or a propensity
for a disease state, as outlined herein.
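
As a rough illustration of such a comparison, the sketch below lists the positions at which a sample sequence differs from a known reference. It uses Python's standard difflib.SequenceMatcher as a simple stand-in and is not the Bestfit algorithm; the function name and both sequences are hypothetical placeholders.

```python
# Illustrative sketch only: report regions where a sample sequence differs from
# a reference sequence. difflib is a generic matcher, not a homology program.
from difflib import SequenceMatcher


def sequence_differences(reference: str, sample: str):
    """Return (operation, ref_start, ref_segment, sample_segment) tuples for
    every region in which the sample differs from the reference."""
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical placeholder sequences (assumptions for the example only).
known_gene = "ATGGCTAGCTTGACCGATCG"
patient_gene = "ATGGCTAGCTAGACCGATCG"
differences = sequence_differences(known_gene, patient_gene)
```

An empty list indicates that no differences were found; whether any reported difference correlates with a disease state is a separate determination that this sketch does not address.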

In a preferred embodiment, the angiogenesis genes are used as probes to 
determine the number of copies of the angiogenesis gene in the genome. 

In another preferred embodiment, the angiogenesis genes are used as probes to
determine the chromosomal localization of the angiogenesis genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations and the like are identified in the
angiogenesis gene locus.

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of an angiogenesis protein
or modulator thereof, is administered to a patient. By "therapeutically effective dose" herein
is meant a dose that produces the effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery,
Lippincott, Williams & Wilkins Publishers, ISBN:0683305727; Lieberman (1992)
Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846, 082476918X,
0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of Pharmaceutical
Compounding, Amer. Pharmaceutical Assn, ISBN 0917330889; and Pickar (1999) Dosage
Calculations, Delmar Pub, ISBN 0766805042). As is known in the art, adjustments for
angiogenesis degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and
other animals, particularly mammals. Thus the methods are applicable to both human
therapy and veterinary applications. In the preferred embodiment the patient is a mammal,
preferably a primate, and in the most preferred embodiment the patient is human.

The administration of the angiogenesis proteins and modulators thereof of the
present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, for example, in the treatment of wounds and inflammation, the angiogenesis
proteins and modulators may be directly applied as a solution or spray.

The pharmaceutical compositions of the present invention comprise an
angiogenesis protein in a form suitable for administration to a patient. In the preferred
embodiment, the pharmaceutical compositions are in a water soluble form, such as being
present as pharmaceutically acceptable salts, which is meant to include both acid and base
addition salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain
the biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol.

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that angiogenesis protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.) when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise an angiogenesis
protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous
carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like. These
solutions are sterile and generally free of undesirable matter. These compositions may be
sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, for example, sodium acetate, sodium chloride, potassium chloride, calcium
chloride, sodium lactate and the like. The concentration of active agent in these formulations
can vary widely, and will be selected primarily based on fluid volumes, viscosities, body
weight and the like in accordance with the particular mode of administration selected and the
patient's needs (e.g., Remington's Pharmaceutical Science, 15th ed., Mack Publishing
Company, Easton, Pennsylvania (1980) and Goodman and Gilman, The Pharmacological
Basis of Therapeutics, (Hardman, J.G., Limbird, L.E., Molinoff, P.B., Ruddon, R.W., and
Gilman, A.G., eds.) The McGraw-Hill Companies, Inc., 1996).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of angiogenesis proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be given
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

It will be appreciated that the present angiogenesis protein-modulating
compounds can be administered alone or in combination with additional angiogenesis
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-8, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of angiogenesis-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152, Academic Press, Inc., San Diego, CA (Berger); F.M. Ausubel et al., eds., Current
Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley &
Sons, Inc., (supplemented through 1999); and Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989).

In a preferred embodiment, angiogenesis proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
angiogenesis genes (including both the full-length sequence, partial sequences, or regulatory
sequences of the angiogenesis coding regions) can be administered in a gene therapy
application. These angiogenesis genes can include antisense applications, either as gene
therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Angiogenesis polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, for example, lipidated peptides (e.g., Vitiello, A. et al., J. Clin.
Invest. 95:341, 1995), peptide compositions encapsulated in poly(DL-lactide-co-glycolide)
("PLG") microspheres (see, e.g., Eldridge, et al., Molec. Immunol. 28:287-294, 1991; Alonso
et al., Vaccine 12:299-306, 1994; Jones et al., Vaccine 13:675-681, 1995), peptide
compositions contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et
al., Nature 344:873-875, 1990; Hu et al., Clin Exp Immunol 113:235-243, 1998), multiple
antigen peptide systems (MAPs) (see, e.g., Tam, J. P., Proc. Natl. Acad. Sci. U.S.A.
85:5409-5413, 1988; Tam, J. P., J. Immunol. Methods 196:17-32, 1996), peptides formulated as
multivalent peptides; peptides for use in ballistic delivery systems, typically crystallized
peptides, viral delivery vectors (Perkus, M. E. et al., In: Concepts in vaccine development,
Kaufmann, S. H. E., ed., p. 379, 1996; Chakrabarti, S. et al., Nature 320:535, 1986; Hu, S. L.
et al., Nature 320:537, 1986; Kieny, M.-P. et al., AIDS Bio/Technology 4:790, 1986; Top, F.
H. et al., J. Infect. Dis. 124:148, 1971; Chanda, P. K. et al., Virology 175:535, 1990),
particles of viral or synthetic origin (e.g., Kofler, N. et al., J. Immunol. Methods. 192:25,
1996; Eldridge, J. H. et al., Sem. Hematol. 30:16, 1993; Falo, L. D., Jr. et al., Nature Med.
7:649, 1995), adjuvants (Warren, H. S., Vogel, F. R., and Chedid, L. A. Annu. Rev. Immunol.
4:369, 1986; Gupta, R. K. et al., Vaccine 11:293, 1993), liposomes (Reddy, R. et al., J.
Immunol. 148:1585, 1992; Rock, K. L., Immunol. Today 17:131, 1996), or naked or particle
absorbed cDNA (Ulmer, J. B. et al., Science 259:1745, 1993; Robinson, H. L., Hunt, L. A.,
and Webster, R. G., Vaccine 11:957, 1993; Shiver, J. W. et al., In: Concepts in vaccine
development, Kaufmann, S. H. E., ed., p. 423, 1996; Cease, K. B., and Berzofsky, J. A.,
Annu. Rev. Immunol. 12:923, 1994 and Eldridge, J. H. et al., Sem. Hematol. 30:16, 1993).
Toxin-targeted delivery technologies, also known as receptor mediated targeting, such as
those of Avant Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, for example, Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories,
Detroit, MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2
(SmithKline Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel
(alum) or aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of
acylated tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, for example, as a vector to express nucleotide sequences that encode
angiogenic polypeptides or polypeptide fragments. Upon introduction into a host, the
recombinant vaccinia virus expresses the immunogenic peptide, and thereby elicits an
immune response. Vaccinia vectors and methods useful in immunization protocols are
described in, e.g., U.S. Patent No. 4,722,848. Another vector is BCG (Bacille Calmette
Guerin). BCG vectors are described in Stover et al., Nature 351:456-460 (1991). A wide
variety of other vectors useful for therapeutic administration or immunization, e.g., adeno and
adeno-associated virus vectors, retroviral vectors, Salmonella typhi vectors, detoxified
anthrax toxin vectors, and the like, will be apparent to those skilled in the art from the
description herein (see, e.g., Shata et al. (2000) Mol Med Today 6:66-71; Shedlock et al., J
Leukoc Biol 68:793-806, 2000; Hipp et al., In Vivo 14:571-85, 2000).

Methods for the use of genes as DNA vaccines are well known, and include
placing an angiogenesis gene or portion of an angiogenesis gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in an angiogenesis patient.
The angiogenesis gene used for DNA vaccines can encode full-length angiogenesis proteins,
but more preferably encodes portions of the angiogenesis proteins including peptides derived
from the angiogenesis protein. In one embodiment, a patient is immunized with a DNA
vaccine comprising a plurality of nucleotide sequences derived from an angiogenesis gene.
For example, angiogenesis-associated genes or sequences encoding subfragments of an
angiogenesis protein are introduced into expression vectors and tested for their
immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T cell
responses. This procedure provides for production of cytotoxic T cell responses against cells
which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the angiogenesis polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment angiogenesis genes find use in generating
animal models of angiogenesis. When the angiogenesis gene identified is repressed or
diminished in angiogenic tissue, gene therapy technology, e.g., wherein antisense RNA is
directed to the angiogenesis gene, will also diminish or repress expression of the gene.
Animal models of angiogenesis find use in screening for modulators of an
angiogenesis-associated sequence or modulators of angiogenesis. Similarly, transgenic animal
technology including gene knockout technology, for example as a result of homologous
recombination with an appropriate gene targeting vector, will result in the absence or increased
expression of the angiogenesis protein. When desired, tissue-specific expression or knockout of
the angiogenesis protein may be necessary.

It is also possible that the angiogenesis protein is overexpressed in
angiogenesis. As such, transgenic animals can be generated that overexpress the
angiogenesis protein. Depending on the desired expression level, promoters of various
strengths can be employed to express the transgene. Also, the number of copies of the
integrated transgene can be determined and compared for a determination of the expression
level of the transgene. Animals generated by such methods find use as animal models of
angiogenesis and are additionally useful in screening for modulators to treat angiogenesis or
to evaluate a therapeutic entity.

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, angiogenesis-specific nucleic
acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative angiogenesis polypeptides or polynucleotides, small molecule
inhibitors of angiogenesis-associated sequences, etc. A therapeutic product may include
sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of
angiogenesis-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: an angiogenesis-associated polypeptide or polynucleotide, reaction tubes, and
instructions for testing angiogenic-associated activity. Optionally, the kit contains
biologically active angiogenesis protein. A wide variety of kits and components can be
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality
of genes or products. The genes will be selected based on correlations with important
parameters in disease which may be identified in historical or outcome data.

It is understood that the examples described above in no way serve to limit 
the true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent
application were specifically and individually indicated to be incorporated by reference. 



EXAMPLES 



Example 1: Tissue Preparation, Labeling Chips, and Fingerprints
Purify total RNA from tissue using TRIzol Reagent 

Homogenize tissue samples in 1ml of TRIzol per 50mg of tissue using a
Polytron 3100 homogenizer. The generator/probe used depends upon the tissue size. A
generator that is too large for the amount of tissue to be homogenized will cause a loss of
sample and lower RNA yield. TRIzol is added directly to frozen tissue, which is then
homogenized. Following homogenization, insoluble material is removed by centrifugation at
7500 x g for 15 min in a Sorvall superspeed or 12,000 x g for 10 min in an Eppendorf
centrifuge at 4°C. The clear homogenate is transferred to a new tube for use. The samples
may be frozen now at -60° to -70°C (and kept for at least one month). The homogenate is
mixed with 0.2ml of chloroform per 1ml of TRIzol reagent used in the original
homogenization and incubated at room temperature for 2-3 minutes. The aqueous phase is then
separated by centrifugation and transferred to a fresh tube and the RNA precipitated using
isopropyl alcohol. The pellet is isolated by centrifugation, washed, air-dried, resuspended in
an appropriate volume of DEPC H2O, and the absorbance measured.
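
The reagent ratios above scale linearly with tissue mass, which can be sketched as follows. The function names are hypothetical. The conversion of absorbance to concentration uses the conventional approximation of about 40 ug/ml of RNA per A260 unit, which is an assumption not stated in this example.

```python
# Illustrative sketch only: scale TRIzol and chloroform volumes from the stated
# ratios, and estimate RNA concentration from an A260 reading.

def trizol_volumes(tissue_mg: float) -> dict:
    """1 ml TRIzol per 50 mg tissue; 0.2 ml chloroform per 1 ml TRIzol."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol_ml = tissue_mg / 50.0
    return {"trizol_ml": trizol_ml, "chloroform_ml": 0.2 * trizol_ml}


def rna_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0,
                                ug_per_ml_per_a260: float = 40.0) -> float:
    """Estimate RNA concentration; the 40 ug/ml factor is an assumed convention."""
    return a260 * ug_per_ml_per_a260 * dilution_factor


volumes = trizol_volumes(125)                          # 125 mg of tissue
concentration = rna_concentration_ug_per_ml(0.25, 50)  # A260 0.25 at 1:50 dilution
```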

Purification of poly A+ mRNA from total RNA is performed as follows. Heat 
an Oligotex suspension to 37°C and mix immediately before adding to RNA. The 
Elution Buffer is heated at 70°C. Warm up 2 x Binding Buffer at 65°C if there is precipitate 
in the buffer. Mix total RNA with DEPC-treated water, 2 x Binding Buffer, and Oligotex 
according to Table 2 on page 16 of the Oligotex Handbook. Incubate for 3 minutes at 65°C. 
Incubate for 10 minutes at room temperature. Centrifuge for 2 minutes at 14,000 to 18,000 g. 
Remove supernatant without disturbing Oligotex pellet. A little bit of solution can be left 
behind to reduce the loss of Oligotex. Gently resuspend in Wash Buffer OW2 and pipet onto 
spin column. Centrifuge the spin column at full speed for 1 minute. Transfer spin column to 
a new collection tube and gently resuspend in Wash Buffer OW2 and centrifuge as described 
herein. Transfer spin column to a new tube and elute with 20 to 100 ul of preheated (70°C) 
Elution Buffer. Gently resuspend Oligotex resin by pipetting up and down. Centrifuge as 
above. Repeat elution with fresh elution buffer or use first eluate to keep the elution volume 
low. Read absorbance, using diluted Elution Buffer as the blank. Before proceeding with 
cDNA synthesis, precipitate the mRNA as follows: add 0.4 vol. of 7.5 M NH4OAc + 2.5 vol. 
of cold 100% ethanol. Precipitate at -20°C 1 hour to overnight (or 20-30 min. at -70°C). 
Centrifuge at 14,000-16,000 x g for 30 minutes at 4°C. Wash pellet with 0.5ml of 
80% ethanol (-20°C) then centrifuge at 14,000-16,000 x g for 5 minutes at room temperature. 
Repeat 80% ethanol wash. Air dry the ethanol from the pellet in the hood. Suspend pellet in 
DEPC H2O at 1ug/ul concentration. 
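The precipitation volumes scale with the eluate volume, so they can be worked out with a short helper. This is a minimal sketch that assumes both the 0.4 vol. of NH4OAc and the 2.5 vol. of ethanol are taken relative to the starting sample volume (the text does not say whether the ethanol volume counts the added salt); the function name is hypothetical.

```python
def precipitation_volumes_ul(sample_ul):
    """Return (NH4OAc ul, ethanol ul) for an mRNA sample of the given volume.

    Assumption: both multipliers apply to the starting sample volume.
    """
    nh4oac_ul = 0.4 * sample_ul   # 7.5 M NH4OAc
    ethanol_ul = 2.5 * sample_ul  # cold 100% ethanol
    return nh4oac_ul, ethanol_ul


# Hypothetical 100 ul eluate.
salt, etoh = precipitation_volumes_ul(100.0)
print(f"NH4OAc: {salt:.1f} ul, ethanol: {etoh:.1f} ul")
```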

To further clean up total RNA using Qiagen's RNeasy kit, add no more than 
100ug to an RNeasy column. Adjust sample to a volume of 100ul with RNase-free water. 
Add 350ul Buffer RLT then 250ul ethanol (100%) to the sample. Mix by pipetting (do not 
centrifuge) then apply sample to an RNeasy mini spin column. Centrifuge for 15 sec at 
>10,000rpm. Transfer column to a new 2-ml collection tube. Add 500ul Buffer RPE and 
centrifuge for 15 sec at >10,000rpm. Discard flowthrough. Add 500ul Buffer RPE and 
centrifuge for 15 sec at >10,000rpm. Discard flowthrough then centrifuge for 2 min at 
maximum speed to dry column membrane. Transfer column to a new 1.5-ml collection tube 
and apply 30-50ul of RNase-free water directly onto column membrane. Centrifuge 1 min at 
>10,000rpm. Repeat elution and read absorbance. 

cDNA synthesis using Gibco's "Superscript Choice System for cDNA Synthesis" kit 
First Strand cDNA synthesis is performed as follows. Use 5ug of total RNA 
or 1ug of poly A+ mRNA as starting material. For total RNA, use 2ul of Superscript RT. For 
polyA+ mRNA, use 1ul of Superscript RT. Final volume of first strand synthesis mix is 
20ul. RNA must be in a volume no greater than 10ul. Incubate RNA with 1ul of 100pmol 
T7-T24 oligo for 10 min at 70°C. On ice, add 7 ul of: 4ul 5X 1st Strand Buffer, 2ul of 0.1M 
DTT, and 1 ul of 10mM dNTP mix. Incubate at 37°C for 2 min then add Superscript RT. 
Incubate at 37°C for 1 hour. 
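The first-strand volumes above leave little room for the RNA itself, which is why the 10ul limit matters. The sketch below works out how much water would be needed to reach the 20ul final volume. Making up the difference with water is an assumption (the text only states the final volume), and the names are hypothetical.

```python
# Superscript RT volume depends on the starting material, as stated above.
RT_VOLUME_UL = {"total": 2.0, "polyA": 1.0}


def first_strand_water_ul(rna_ul, rna_type):
    """Water (ul) needed to bring the first-strand reaction to 20 ul."""
    if rna_ul > 10.0:
        raise ValueError("RNA must be in a volume no greater than 10 ul")
    # 1 ul T7-T24 oligo + 7 ul buffer/DTT/dNTP mix + RT enzyme
    fixed_ul = 1.0 + 7.0 + RT_VOLUME_UL[rna_type]
    return 20.0 - fixed_ul - rna_ul


print(first_strand_water_ul(10.0, "total"))  # total RNA fills the reaction
print(first_strand_water_ul(8.0, "polyA"))
```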

For the second strand synthesis, place 1st strand reactions on ice and add: 91ul 
DEPC H2O; 30ul 5X 2nd Strand Buffer; 3ul 10mM dNTP mix; 1ul 10U/ul E. coli DNA 
Ligase; 4ul 10U/ul E. coli DNA Polymerase; and 1ul 2U/ul RNase H. Mix and incubate 2 
hours at 16°C. Add 2ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10ul of 0.5M 
EDTA. A further clean-up of DNA is performed using phenol:chloroform:isoamyl alcohol 
(25:24:1) purification. 
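A quick arithmetic check of the second-strand additions shows that the listed volumes bring the 5X buffer to 1X. The sketch below assumes the additions go into the full 20ul first-strand reaction; the dictionary and function names are hypothetical.

```python
FIRST_STRAND_UL = 20.0  # assumed: entire first-strand reaction is carried over

SECOND_STRAND_ADDITIONS_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10mM dNTP mix": 3.0,
    "E. coli DNA Ligase (10U/ul)": 1.0,
    "E. coli DNA Polymerase (10U/ul)": 4.0,
    "RNase H (2U/ul)": 1.0,
}


def second_strand_total_ul():
    """Total reaction volume before T4 DNA Polymerase and EDTA are added."""
    return FIRST_STRAND_UL + sum(SECOND_STRAND_ADDITIONS_UL.values())


def buffer_fold():
    """Final fold-concentration of the 5X 2nd Strand Buffer."""
    return 5.0 * SECOND_STRAND_ADDITIONS_UL["5X 2nd Strand Buffer"] / second_strand_total_ul()


print(second_strand_total_ul(), buffer_fold())
```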

In vitro Transcription (IVT) and labeling with biotin is performed as follows: 
Pipet 1.5ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2ul T7 
10xATP (75mM) (Ambion); 2ul T7 10xGTP (75mM) (Ambion); 1.5ul T7 10xCTP (75mM) 
(Ambion); 1.5ul T7 10xUTP (75mM) (Ambion); 3.75ul 10mM Bio-11-UTP (Boehringer-
Mannheim/Roche or Enzo); 3.75ul 10mM Bio-16-CTP (Enzo); 2ul 10x T7 transcription 
buffer (Ambion); and 2ul 10x T7 enzyme mix (Ambion). The final volume is 20ul. Incubate 
6 hours at 37°C in a PCR machine. The RNA can be further cleaned. 
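The labeling mix can be checked the same way. The sketch below sums the listed volumes and computes final nucleotide concentrations with C1V1 = C2V2; it suggests that the biotinylated and unlabeled CTP (or UTP) together match the ATP and GTP concentration. The variable and function names are hypothetical.

```python
CDNA_UL = 1.5

# (component, volume in ul, stock concentration in mM or None)
IVT_MIX = [
    ("T7 10x ATP", 2.0, 75.0),
    ("T7 10x GTP", 2.0, 75.0),
    ("T7 10x CTP", 1.5, 75.0),
    ("T7 10x UTP", 1.5, 75.0),
    ("Bio-11-UTP", 3.75, 10.0),
    ("Bio-16-CTP", 3.75, 10.0),
    ("10x T7 transcription buffer", 2.0, None),
    ("10x T7 enzyme mix", 2.0, None),
]


def ivt_total_ul():
    """Total reaction volume: cDNA plus all labeling-mix components."""
    return CDNA_UL + sum(volume for _, volume, _ in IVT_MIX)


def final_mM(volume_ul, stock_mM, total_ul=20.0):
    """Final concentration after dilution into the reaction (C1V1 = C2V2)."""
    return volume_ul * stock_mM / total_ul


atp = final_mM(2.0, 75.0)
ctp_total = final_mM(1.5, 75.0) + final_mM(3.75, 10.0)
print(ivt_total_ul(), atp, ctp_total)
```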

Fragmentation is performed as follows. 15 ug of labeled RNA is usually 
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is 
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in 
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment 
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation 
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled 
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to 
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea 
of the transcript size range. 

For hybridization, 200 ul (10ug cRNA) of a hybridization mix is put on the 
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it 
is recommended that an initial hybridization mix of 300 ul or more be made. The 
hybridization mix is: fragmented labeled RNA (50ng/ul final conc.); 50 pM 948-b control 
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1mg/ml herring sperm DNA; 
0.5mg/ml acetylated BSA; and 1xMES hyb buffer to 300 ul. 
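The amounts quoted for fragmentation and hybridization are consistent with one another, which the following sketch makes explicit: at 50ng/ul, 200 ul on the chip carries 10 ug of cRNA and a 300 ul mix uses the 15 ug that is usually fragmented. The function name is hypothetical.

```python
def crna_mass_ug(mix_volume_ul, conc_ng_per_ul=50.0):
    """Mass of fragmented cRNA (ug) contained in a given volume of hyb mix."""
    return mix_volume_ul * conc_ng_per_ul / 1000.0


print(crna_mass_ug(200.0))  # volume applied to one chip
print(crna_mass_ug(300.0))  # initial mix for multiple hybridizations
```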

Labeling is performed as follows: The hybridization reaction includes non-biotinylated 
IVT (purified by RNeasy columns); IVT antisense RNA, 4 ug; random 
hexamers (1 ug/ul), 4 ul; and water to 14 ul. The reaction is incubated at 70°C, 10 min. 
Reverse transcription is performed in the following reaction: 5X First Strand (BRL) buffer, 6 
ul; 0.1 M DTT, 3 ul; 50X dNTP mix, 0.6 ul; H2O, 2.4 ul; Cy3 or Cy5 dUTP (1mM), 3 ul; SS 
RT II (BRL), 1 ul in a final volume of 16 ul. Add to hybridization reaction, incubate 30 
min., 42°C. Add 1 ul SSII and incubate another hour. Put on ice. 50X dNTP mix (25mM of 
cold dATP, dCTP, and dGTP, 10mM of dTTP: 25 ul each of 100mM dATP, dCTP, and 
dGTP; 10 ul of 100mM dTTP to 15 ul H2O. dNTPs from Pharmacia) 
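The 50X dNTP recipe can be verified with a short calculation. The sketch below computes the concentration of each nucleotide in the mix and then in the labeling reaction, assuming the 16 ul reverse transcription mix is added to the 14 ul annealing reaction for a 30 ul total (an inference from the volumes given, not an explicit statement). The names are hypothetical.

```python
# Volumes (ul) of 100 mM stocks combined to make the 50X dNTP mix.
STOCK_MM = 100.0
DNTP_VOLUMES_UL = {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0}
WATER_UL = 15.0


def dntp_mix_mM():
    """Concentration (mM) of each nucleotide in the 50X mix."""
    total_ul = sum(DNTP_VOLUMES_UL.values()) + WATER_UL
    return {name: vol * STOCK_MM / total_ul for name, vol in DNTP_VOLUMES_UL.items()}


def reaction_mM(mix_ul=0.6, reaction_ul=30.0):
    """Concentration (mM) of each nucleotide in the labeling reaction."""
    return {name: conc * mix_ul / reaction_ul for name, conc in dntp_mix_mM().items()}


print(dntp_mix_mM())
print(reaction_mM())
```

The lower dTTP concentration leaves room for the Cy3 or Cy5 dUTP to be incorporated in its place.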

RNA degradation is performed as follows. Add 86 ul H2O, 1.5 ul 1M NaOH/ 
2mM EDTA and incubate at 65°C, 10 min. For U-Con 30, 500 ul TE/sample spin at 7000g 
for 10 min, save flow through for purification. For Qiagen purification, suspend u-con 
recovered material in 500ul buffer PB and proceed using Qiagen protocol. For DNAse 
digestion, add 1 ul of 1/100 dil of DNAse/30ul Rx and incubate at 37°C for 15 min. Incubate 
for 5 min at 95°C to denature the DNAse. 

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC, 
2.3 ul; Na pyrophosphate, 7.5 ul; 10mg/ml Herring sperm DNA, 1ul of 1/10 dilution, to 21.8 ul 
final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat 95°C, 2 
min and slow cool at room temp. for 20 min. Put on slide and hybridize overnight at 64°C. 
Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 mls 20X SSC+0.75mls 
10% SDS in 250mls H2O; 1X SSC: 5 min., 12.5 mls 20X SSC in 250mls H2O; 0.2X SSC: 5 
min., 2.5 mls 20X SSC in 250mls H2O. Dry slides and scan at appropriate PMT's and 
channels. 
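The wash recipes follow from a simple dilution calculation, sketched below. It treats 250 ml as the final volume of each wash, which is an assumption that happens to reproduce the stock volumes quoted above; the function name is hypothetical.

```python
def stock_volume_ml(target, stock, final_ml=250.0):
    """Volume of stock (ml) needed so the wash reaches the target strength.

    target and stock must be in the same units (e.g. X for SSC, % for SDS).
    """
    return target * final_ml / stock


# (wash name, SSC strength in X, SDS in %)
WASHES = [("3X SSC/0.03% SDS", 3.0, 0.03), ("1X SSC", 1.0, 0.0), ("0.2X SSC", 0.2, 0.0)]

for name, ssc_x, sds_pct in WASHES:
    ssc_ml = stock_volume_ml(ssc_x, 20.0)    # from 20X SSC
    sds_ml = stock_volume_ml(sds_pct, 10.0)  # from 10% SDS
    print(f"{name}: {ssc_ml:.2f} ml 20X SSC, {sds_ml:.2f} ml 10% SDS")
```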


Example 2. A model of angiogenesis is used to determine expression in angiogenesis 

In the model of angiogenesis used to determine expression of angiogenesis-associated 
sequences, human umbilical vein endothelial cells (HUVEC) were obtained, e.g., 
as passage 1 (p1) frozen cells from Cascade Biologics (Oregon) and grown in maintenance 
medium: Medium 199 (Life Technologies) supplemented with 20% pooled human serum, 
100 mg/ml heparin and 75 mg/ml endothelial cell growth supplements (Sigma) and 
gentamicin (Life Technologies). An in vitro cell system model was used in which 2x10^5 
HUVECs were cultured in 0.5 ml 3 mgs/ml plasminogen-depleted fibrinogen (Calbiochem, 
San Diego, CA) that was polymerized by the addition of 1 unit of maintenance medium 
supplemented with 100 ng/ml VEGF and HGF and 10 ng/ml TGF-α (R&D Systems, 
Minneapolis, MN) added (growth medium). The growth medium was replaced every 2 days. 
Samples for RNA were collected, e.g., at 0, 2, 6, 15, 24, 48, and 96 hours of culture. The 
fibrin clots were placed in Trizol (Life Technologies) and disrupted using a Tissuemizer. 
Thereafter standard procedures were used for extracting the RNA (e.g., Example 1). 

Angiogenesis-associated sequences thus identified are shown in Tables 1-8. 
As indicated, some of the Accession numbers include expression sequence tags (ESTs). 
Thus, in one embodiment herein, genes within an expression profile, also termed expression 
profile genes, include ESTs and are not necessarily full length. 

Pkey: Unique Eos probeset identifier number 
Accession: Accession number used for previous patent filings 
Exemplar Accession: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: 

Pkey Accession 






134404 AB000450 


AB000450 




121443 AB002380 


AF180681 


Hs.6582 


100082 AB003103 


AA130080 




132817 AB004884 




Hs.57553 


130150 AF000573_rna1 


BE094848 


Hs.15113 


100104 AF003937 


AF008937 


Hs.102178 


130839 AF009301 


AB011169 




427064 AF009368 


AF029674 


Hs.173422 


100113 D00591 


NM_001269 


Hs.84746 


133980 D00760 


M294921 


Hs.250811 


100129 D11139 


AA469369 


Hs.5831 


100154 D14657 


H60720 


Hs.81892 


100169 D14878 








NM_002410 


Hs.121502 


100190 D21090 




Hs.178658 


134742 D26135 


NM_001346 




100211 D26528 


D26528 




100238 D30742 






130283 D31762 


NM_012288 




134237 D31765 


D31765 


Hs.170114 


100248 D31888 


NM_015156 




100256 D38128 






100262 D3850Q 






134329 D38551 






100281 D42087 


AF091035 




100294 D4939S 


AA331881 


Hs.75454 


100327 D55640 


D55640 




100335 D63391 


AW247529 


Hs.D/UJ 


134495 D63477 


D63477 




100338 D63483 


D86864 




135152 D64015 


M96954 


Hs.182741 


134269 D79990 


NM_014737 




100372 D79997 


NM_014791 




134304 D80010 


BE613486 


Hs.81412 


100394 D84276 


D84284 


Hs.66052 


100405 D86425 


AW291587 


Hs.82733 


100418 D86978 




Hs.84790 


133154 D87012 


D87012 


Hs.194685 


134347 D87075 


AF164142 




128653 D87432 


D87432 




100438 D87448 


AA013051 


H91417 


134593 D87845 


NM_000437 


Hs.234392 


100481 HG1098-HT1098 


X70377 


Hs.121489 


100552 HG2167-HT2237 M019521 


Hs.301946 


100591 HG2415-HT2511 NM_004091 


Hs.231444 


cds 






100652 HG2825-HT2949 BE613608 


Hs.142653 


100662 HG2887-HT3031_r 


AI368680 


100899 HG4660-HT5073 AL039123 


Hs.103042 


100905 HG4704-HT5146 L12260 


Hs.172816 


100945 HG884-HT884 


AF002225 


Hs.180686 


100950 HG919-HT919 


AF128542 


Hs.166846 


100964 J00212J 


J00212 




135407 J04029 


J04029 


Hs.99936 


130149 J04031 


AW067805 


Hs.172665 


131877 J04088 


J04088 


Hs.156346 


101016 J04543 


J04543 


Hs.78637 


134786 L06139 


T29618 


Hs.89640 


134100 L07540 


AA460085 


Hs.171075 


134078 L08895 


L08895 


Hs.78995 


101132 L11239 


L11239 




134849 L11353 


BE409525 


Hs.902 


106432 L13773 


AK000310 


Hs.17138 



vaccinia related kinase 2 

Rho guanine exchange factor (GEF) 12 

proteasome (prosome ; macropain) 26S subunit, non-ATPase, 12 

tousled-like kinase 2 

homogentisate 1,2-dioxygenase (homogentisate oxidase) 
syntaxin 16 

similar to S. cerevisiae SSM4 
KIAA1605 protein 
chromosome condensation 1 

v-ral simian leukemia viral oncogene homolog B (ras related; GTP binding protein) 
tissue inhibitor of metalloproteinase 1 (erythroid potentiating activity, collagenase inhibitor) 
KIAA0101 gene product 
D123 gene product 

mannosyl (alpha-1,6-)-glycoprotein beta-1,6-N-acetyl-glucosaminyltransferase 
RAD23 (S. cerevisiae) homolog B 
diacylglycerol kinase, gamma (90kD) 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 7 (RNA helicase, 52kD) 
calcium/calmodulin-dependent protein kinase IV 
TRAM-like protein 
KIAA0061 protein 
KIAA0071 protein 

prostaglandin I2 (prostacyclin) receptor (IP) 
postmeiotic segregation increased 2-like 4 
RAD21 (S.pombe) homolog 
KIAA0118 protein 
peroxiredoxin 3 

gb:Human monocyte PABL (pseudoautosomal boundary-like sequence) mRNA, clone Mo2. 

platelet-activating factor acetylhydrolase, isoform Ib, gamma subunit (29kD) 

KIAA0143 protein 

acetyl LDL receptor; SREC 

TIA1 cytotoxic granule-associated RNA-binding protein-like 1 
Ras association (RalGDS/AF-6) domain family 2 
KIAA0175 gene product 
lipin 1 

CD38 antigen (p45) 
nidogen 2 
KIAA0225 protein 
topoisomerase (DNA) III beta 

solute carrier family 23 (nucleobase transporters), member 1 

solute carrier family 7 (cationic amino acid transporter, y+ system), member 6 

topoisomerase (DNA) II binding protein 

platelet-activating factor acetylhydrolase 2 (40kD) 

cystatin D 

lysosomal 

Homo sapiens, Similar to hypothetical protein PR01722, clone MGC:15692, mRNA, complete 
ret finger protein 

Hs.316 SRY (sex determining region Y)-box 2 
microtubule-associated protein 1B 
neuregulin 1 

ubiquitin protein ligase E3A (human papilloma virus E6-associated protein, Angelman syndrome) 

polymerase (DNA directed), epsilon 

Empirically selected from AFFX single probeset 

keratin 10 (epidermolytic hyperkeratosis; keratosis palmaris et plantaris) 

methylenetetrahydrofolate dehydrogenase (NADP+ dependent), methenyltetrahydrofolate 

topoisomerase (DNA) II alpha (170kD) 

annexin A7 

TEK tyrosine kinase, endothelial (venous malformations, multiple cutaneous and mucosal) 
replication factor C (activator 1) 5 (36.5kD) 

MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C) 
gastrulation brain homeo box 1 
neurofibroma 2 (bilateral acoustic neuroma) 
hypothetical protein FLJ20303 
101152 


L13800 


AI984625 


Hs.9884 


135397 


L14922 


L14922 


Hs.166563 


131687 


L15189 


BE297635 


Hs.3069 


101168 


L15388 


NM_005308 


Hs.211569 


421155 


L16895 


H87879 


Hs.102267 


101226 


L27476 


AF083892 


Hs.75608 


133975 


L27624 


C18355 


Hs.295944 


134739 


L32976 


NM_002419 


Hs.89449 


130155 


L33404 


AA101043 


Hs.151254 


440538 


L35263 


W76332 


Hs.79107 


132813 


L37347 


BE313625 


Hs.57435 


101294 


L40371 


AF168418 


Hs.116784 


101300 


L40391 


BE535511 


Hs.74137 


101310 


L41607 


L41607 


Hs.934 


130344 


L77566 


AW250122 


Hs.154879 


embryonic lethal 






101381 


M13928 


AW675039 


Hs.1227 


101668 


M14016 


AW005903 


Hs.78601 


133780 


M14219 


AA557660 


Hs.76152 


101396 


M15796 


BE267931 


Hs.78996 


101447 


M21305 


M21305 




101458 


M22092 


M22092 




101470 


M22898 


NM_000546 


Hs.1846 






NM_002884 


Hs.865 


101478 


M23379 


NM_002890 


Hs.758 


406698 


M24364 


X03068 


Hs.73931 


133519 


M24400 


AW583062 


Hs.74502 


131185 


M25753 


BE280074 


Hs.23960 


134116 


M27691 


R84694 


Hs.79194 


133999 


M28213 


AA535244 


Hs.78305 


130174 


M29550 


M29551 


Hs.151531 


129963 


M29971 


M29971 


Hs.1384 


132983 


M30269 


M30269 


Hs.62041 


133900 


M31158 


M31158 


Hs.77439 


101543 


M31166 


M31166 


Hs.2050 


101545 


M31210 


BE246154 


Hs.154210 


101620 


M55420 


S55271 


Hs.247930 


134691 


M59979 


AW382987 


Hs.88474 


133595 


M62810 


AA393273 


Hs.75133 


130425 


M63838 


AA243383 


Hs.155530 


101700 


M64710 


D90337 


Hs.247916 


101714 


M68874 


M68874 


Hs.211587 


134246 


M74524 


D28459 


Hs.80612 


101760 


M80254 


M80254 


Hs.173125 


133948 


M81780_cds3 


X59960 


Hs.77813 


101791 


M83822 


M83822 


Hs.62354 


101812 


M86934 


BE439894 


Hs.78991 


101813 


M87338 


NM_002914 


Hs.139226 


133396 


M96326_rna1 


M96326 


Hs.72885 


135152 


M96954 


M96954 


Hs.182741 


129026 


M98833 


AL120297 


Hs.108043 


101901 


S65793 


H38026 


Hs.308 


134831 


S72370 


AA853479 


Hs.89890 


134039 


S78569 


NM_002290 


Hs.78672 


134395 


S79873 


AA456539 


Hs.8262 


101975 


S83325 


AA079717 


Hs.283664 


101977 


S83364 


AF112213 


Hs.184062 


101978 


S83365 


BE561610 


Hs.5809 


interacting factor) 






101998 


U01212 


U01212 


Hs.248153 


102003 


U01922 


U01922 


Hs.125565 


102007 


U02556 


U02556 


Hs.75307 


102009 


U02680 


BE245149 


Hs.82643 


416658 


U03272 


U03272 


Hs.79432 


132951 


U04209 


AW821182 


Hs.61418 


135389 


U05237 


U05237 


Hs.99872 


102048 


U07225 


U07225 


Hs.339 


130145 


U07620 


U34820 


Hs.151051 


303153 


U09759 




Hs.246857 


420269 


U09820 


U72937 


Hs.96264 


102095 


U11313 


U11313 


Hs.75760 


102123 


U14518 


NM_001809 


Hs.1594 


102126 


U14575 


AW950870 


Hs.78951 


102133 


U15173 


AU076845 


Hs.155596 


102139 


U15932 


NM_004419 


Hs.2128 


102162 


U18291 


AA450274 


Hs.1592 



spindle pole body protein 
replication factor C (activator 1) 1 (145kD) 
heat shock 70kD protein 9B (mortalin-2) 
G protein-coupled receptor kinase 5 
lysyl oxidase 

tight junction protein 2 (zona occludens 2) 
tissue factor pathway inhibitor 2 
mitogen-activated protein kinase kinase kinase 11 
kallikrein 7 (chymotryptic, stratum corneum) 
mitogen-activated protein kinase 14 

solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2 

thyroid hormone receptor interactor 4 

transmembrane trafficking protein 

glucosaminyl (N-acetyl) transferase 2, I-branching enzyme 

DiGeorge syndrome critical region gene DGSI; likely ortholog of mouse expressed sequence 2 

aminolevulinate, delta-, dehydratase 
uroporphyrinogen decarboxylase 
decorin 

proliferating cell nuclear antigen 

gb:Human alpha satellite and satellite 3 junction DNA sequence. 

gb:Human neural cell adhesion molecule (N-CAM) gene, exon SEC and partial cds. 

tumor protein p53 (Li-Fraumeni syndrome) 

RAP1A, member of RAS oncogene family 

RAS p21 protein activator (GTPase activating protein) 1 

major histocompatibility complex, class II, DQ beta 1 

chymotrypsinogen B1 

cyclin B1 

cAMP responsive element binding protein 1 
RAB2, member RAS oncogene family 

protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform (calcineurin A beta) 
O-6-methylguanine-DNA methyltransferase 
nidogen (enactin) 

protein kinase, cAMP-dependent, regulatory, type II, beta 
pentaxin-related gene, rapidly induced by IL-1 beta 
endothelial differentiation, sphingolipid G-protein-coupled receptor, 1 
Epsilon , IgE 

prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase) 

transcription factor 6-like 1 (mitochondrial transcription factor 1-like) 

interferon, gamma-inducible protein 15 

natriuretic peptide precursor C 

phospholipase A2, group IVA (cytosolic, calcium-dependent) 

ubiquitin-conjugating enzyme E2A (RAD6 homolog) 

peptidylprolyl isomerase F (cyclophilin F) 

sphingomyelin phosphodiesterase 1, acid lysosomal (acid sphingomyelinase) 
cell division cycle 4-like 

DNA segment, numerous copies, expressed probes (GS1 gene) 

replication factor C (activator 1) 2 (40kD) 

azurocidin 1 (cationic antimicrobial protein 37) 

TIA1 cytotoxic granule-associated RNA-binding protein-like 1 

Friend leukemia virus integration 1 

arrestin 3, retinal (X-arrestin) 

pyruvate carboxylase 

laminin, alpha 4 



olfactory marker protein 

translocase of inner mitochondrial membrane 8 (yeast) homolog A 
t-complex-associated-testis-expressed 1-like 
protein tyrosine kinase 9 

fibrillin 2 (congenital contractural arachnodactyly) 

microfibrillar-associated protein 1 

fetal Alzheimer antigen 

purinergic receptor P2Y, G-protein coupled, 2 

mitogen-activated protein kinase 10 

mitogen-activated protein kinase 9 

alpha thalassemia/mental retardation syndrome X-linked (RAD54 (S. cerevisiae) homolog)- 

sterol carrier protein 2 

centromere protein A (17kD) 

protein phosphatase 1, regulatory (inhibitor) subunit 8 

BCL2/adenovirus E1B 19kD-interacting protein 2 

dual specificity phosphatases 

CDC16 (cell division cycle 16, S. cerevisiae, homolog) 
102164 


U18300 


NM_000107 


Hs.77602 


427653 


U18383 


AA159001 


Hs.180069 


131817 


U20536 


U20536 


Hs.3280 


102200 




AA232362 


Hs.157205 


102210 


U23028 


BE619413 


Hs.2437 


102214 


U23752 


U23752 


Hs.32964 


132811 


U25435 


U25435 


Hs.57419 


131319 


U25997 


NM_003155 


Hs.25590 


102256 


U28251_cds2 


U28251 


Hs.53237 


132316 


U28831 


U28831 


Hs.44566 


102269 


U30245 


U30245 




134365 


U32315 


AA568906 


Hs.82240 


102293 


U32439 


AF090118 


Hs.79348 


102298 


U32849 


AA382169 


Hs.54483 


102325 


U35139 


AI815867 


Hs.50130 


302344 


U36764 


BE303044 


Hs.192023 


102361 


U39400 


AA223616 


Hs.75859 


102367 U39657 


U39656 


Hs.118825 


102388 U41344 


AA362907 


Hs.76494 


102394 U41766 


NM_003816 


Hs.2442 


129829 


U41813 


AF010258 


Hs.127428 


102251 


U41815 


NM_004398 


Hs.41706 


102409 


U43286 


BE300330 


Hs.118725 


133746 


U44378 


AW410035 


Hs.75862 


102423 


U44754 


Z47542 


Hs.179312 


132828 


U47011_cds1 


AB014615 


Hs.57710 


130441 


U47077 


U63630 


Hs.155637 


102450 


U48251 


U48251 


Hs.75871 


129350 


U50535 


U50535 


Hs.110630 


102534 


U56833 


U96759 


Hs.198307 


130457 


U58091 


AB014595 


Hs.155976 


135065 


U58837 


AA019401 


Hs.93909 


102560 


U59289 


R97457 




102567 


U59863 


U63830 


Hs.146847 


134305 


U67122 


U61397 


Hs.81424 


102638 


U67319 


U67319 


Hs.9216 


132736 


U68019 


AW081883 


Hs.288261 


sapiens 


mad protein homolog (hMAD-3) mRNA 


133070 


U59611 


U92649 


Hs.64311 


102663 


U70322 


NM_002270 


Hs.168075 


134660 


U73524 


U73524 


Hs.87465 


102735 


U79267 


AF111106 


Hs.3382 


102741 


U79291 


AW959829 


Hs.83572 


101175 


U82671_cds2 


U82671 


Hs.36980 


132164 


U84573 


AI752235 


Hs.41270 


102823 


U90914 


D85390 


Hs.5057 


102826 


U91316 


NM_007274 


Hs.8679 


102831 


U91932 


AA262170 


Hs.80917 


102846 


U96131 


BE264974 


Hs.6566 


129777 


U97018 


U97018 


Hs.12451 


134161 


U97188 


AA634543 


Hs.79440 


134854 


V00503 


J03464 


Hs.179573 


302363 


X04327 


AW163799 


Hs.198365 


133708 


X06389 


AI018666 


Hs.75667 


125701 


X07496 


T72104 


Hs.93194 


102915 


X07820 


X07820 


Hs.2258 


134656 


X14787 


AI750878 


Hs.87409 


413858 


X15525_rna1 


NM_001610 


Hs.75589 


102968 


X16396 


AU076611 


Hs.154672 


cyclohydrolase 






102971 


X16609 


X16609 


Hs.183805 


134037 


X53586_rna1 


AI808780 


Hs.227730 


103023 X53793 


AW500470 


Hs.117950 


103037 


X54936 


BE018302 


Hs.2894 


130282 


X55740 


BE245380 


Hs.153952 


134542 X57025 


M14156 


Hs.85112 


128568 X60673_rna1 


H12912 


Hs.274691 


103093 


X60708 


S79876 


Hs.44926 


133506 


X62048 


U10564 


Hs.75188 


129063 


X63097 


X63094 


Hs.283822 


424460 


X63563 


BE275979 


Hs.296014 


133227 


X64037 


AW977263 


Hs.68257 


103181 


X69636 


X69636 


Hs.334731 


103184 


X69878 


U43143 


Hs.74049 


103194 


X70649 


NM_004939 


Hs.78580 



damage-specific DNA binding protein 2 (48kD) 

nuclear respiratory factor 1 

caspase 6, apoptosis-related cysteine protease 

branched chain aminotransferase 1, cytosolic 

eukaryotic translation initiation factor 2B, subunit 5 (epsilon, 82kD) 

SRY (sex determining region Y)-box 11 

CCCTC-binding factor (zinc finger protein) 

stanniocalcin 1 

ESTs, Highly similar to Z169_HUMAN ZINC FINGER PROTEIN 169 [H.sapiens] 
KIAA1641 protein 

gb:Human myelomonocytic specific protein (MNDA) gene, 5' flanking sequence and complete 
syntaxin 3A 

regulator of G-protein signalling 7 
N-myc (and STAT) interactor 
necdin (mouse) homolog 

eukaryotic translation initiation factor 3, subunit 2 (beta, 36kD) 

chromosome 11 open reading frame 4 

mitogen-activated protein kinase kinase 6 

proline arginine-rich end leucine-rich repeat protein 

a disintegrin and metalloproteinase domain 9 (meltrin gamma) 

homeobox A9 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 10 (RNA helicase) 
selenophosphate synthetase 2 

MAD (mothers against decapentaplegic, Drosophila) homolog 4 

small nuclear RNA activating complex, polypeptide 1, 43kD 

fibroblast growth factor 8 (androgen-induced) 

protein kinase, DNA-activated, catalytic polypeptide 

protein kinase C binding protein 1 

Human BRCA2 region, mRNA sequence CG006 

von Hippel-Lindau binding protein 1 

cullin 4B 

cyclic nucleotide gated channel beta 1 
cadherin 13, H-cadherin (heart) 
TRAF family member-associated NFKB activator 
ubiquitin-like 1 (sentrin) 

caspase 7, apoptosis-related cysteine protease 

Homo sapiens cDNA: FLJ23037 fis, clone LNG02036, highly similar to HSU68019 Homo 

a disintegrin and metalloproteinase domain 17 (tumor necrosis factor, alpha, converting enzyme) 

karyopherin (importin) beta 2 

ATP/GTP-binding protein 

protein phosphatase 4, regulatory subunit 1 

hypothetical protein MGC14433 

melanoma antigen, family A, 2 

procollagen-lysine, 2-oxoglutarate 5-dioxygenase (lysine hydroxylase) 2 
carboxypeptidase D 

cytosolic acyl coenzyme A thioester hydrolase 
adaptor-related protein complex 3, sigma 1 subunit 
thyroid hormone receptor interactor 13 
echinoderm microtubule-associated protein-like 
IGF-II mRNA-binding protein 3 
collagen, type I, alpha 2 
2,3-bisphosphoglycerate mutase 



apolipoprotein A-I 



thrombospondin 1 

acid phosphatase 2, lysosomal 

methylene tetrahydrofolate dehydrogenase (NAD+ depend 



ankyrin 1 , erythrocytic 
integrin, alpha 6 

multifunctional polypeptide similar to SAICAR synthetase and AIR carboxylase 

placental growth factor, vascular endothelial growth factor-related protein 

5' nucleotidase (CD73) 

insulin-like growth factor 1 (somatomedin C) 

adenylate kinase 3 

dipeptidylpeptidase IV (CD26, adenosine deaminase complexing protein 2) 

wee1+(S. pombe) homolog 

Rhesus blood group, D antigen 

polymerase (RNA) II (DNA directed) polypeptide B (140kD) 

general transcription factor IIF, polypeptide 1 (74kD subunit) 

Homo sapiens, clone IMAGE:3448306, mRNA, partial cds 

fms-related tyrosine kinase 4 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 1 
103208 X72841 


AW411340 


Hs.31314 


129698 X74987 


BE242144 










130729 X84194 


AI963747 


Hs.18573 


103334 X85753 


NM_001260 


Hs.25283 


132645 X87870 

1 wnoA YftQnfiR 
loouaq AoyuDD 

103352 X89398__cds2 

103353 X89399 


AI654712 
NM_003304 


Hs.54424 
Hs.250687 
Hs 78853 
Hs. 119274 


132173 X89426 


YAQ49K 




103371 X91247 




Hs 13046 


131584 X91648 






1 03376 X92098 




Hs 323378 


103378 X92110 


AL1 19690 


Hs.153618 






Hs 296371 






Hs 334879 


133490 X97230 f 


AF022044 


Hs 274601 


103438 X98263 


AW1 75781 


Hs 152720 




X98296 




1 03452 X99584 






1 33536 Y00264 




) Hs 177486 


135185 Y07566 






118523 Y07759 






134662 Y07827 






1 32083 Y07867 






1 03500 Y09443 






134389 Y09858 


Y09858 




132084 Y12394 


NM_002267 


Hs.3886 


103540 Z11559 


NM_002197 




133152 Z11695 


Z11695 




103548 Z15005 


Z15005 




103612 Z46261 


BE336654 




129092 AA011243.S 


D56365 


Hs 63525 


103692 AA018418 


AW137912 


Hs.227583 


(CACNA1F) gene, complete cds; HSP27 [ 


iseudogene, ( 


103695 AA018758 


AW207152 


MS. IOOOUU 


129796 AA018804 


BE218319 




132258 AA031993 


AA305325 




132683 AA044217 


BE264633 


Hs 143638 


131887 AA046548 


W17064 


Hs.332848 


member 1 






103723 AA057447_s 


BE274312 


Hs 214783 


453368 AA058376 


W20296 


Hs.288178 


133260 AA083572 


AA403045 




103765 AA085696 


AA085696 


Hs.169600 


103766 AA088744 


AI920783 


Hs 191435 


103767 AA089688 


BE244667 


Hs. 296155 


132051 AA091284 


AA393968 


Hs 180145 


103773 AA092700 


AI219323 


Hs 101077 


[C.elegans] 






135289 AA092968 


AW372569 




132729 AA094800 


AW970843 


Hs.55682 


103794 AA100219 


AF244135 


Hs.30670 


131471 AA114885 


AA164842 




134319 AA129547 


BE304999 




103807 AA133016 


AW958264 


Hs 103832 


119159 AA149507 


AF142419 


Hs. 15020 


129863 AA151005 


BE379765 


Hs. 129872 


103850 AA187101 


AA187101 


Hs 213194 


103855 AA195179 s 
322026 AA203138 


W02363 
AW024973 


Hs 302267 
Hs.283675 


135300 AA203645 


AA142922 


Hs.278626 


103861 AA206236 


AA206236 




130634 AA227621 


AI769067 


Hs 127824 


[Celegans] 






447735 AA248283 


AA775268 


H 6127 


103909 M249611 


AA249611 




131236 AA282640 


AF043117 


Hs 24594 


134060 AA287199 


D42039 


Hs78871 


129013 AA313990 


AA371155 


Hs.1 07942 


129435 M314256 


AF151852 


Hs.1 11449 


103988 AA314389 


AA314389 


Hs.42500 


104000 AA324364 


AI146527 


Hs.80475 


425284 AA329211 s 


AF155568 


Hs.155489 


128629 AA399187 


AL096748 


Hs.1 02708 


133281 AA421079 


AK001601 


Hs.69594 



BMXm 

acylphosphatase 1, erythrocyte (common) type 
cyclin-dependent kinase 8 
hepatocyte nuclear factor 4, alpha 
transient receptor potential channel 1 
uracil-DNA glycosylase 

RAS p21 protein activator (GTPase activating protein) 3 (Ins(1,3,4,5)P4-binding protein) 
endothelial cell-specific molecule 1 
thioredoxin reductase 1 
purine-rich element binding protein A 
coated vesicle membrane protein 
HCGVIII-1 protein 

RAB23, member RAS oncogene family 
DR1-associated protein 1 (negative cofactor 2 alpha) 
killer cell immunoglobulin-like receptor, three domains, long cytoplasmic tail, 1 
M-phase phosphoprotein 6 

ubiquitin specific protease 9, X chromosome (Drosophila fat facets related) 
SMT3 (suppressor of mif two 3, yeast) homolog 1 
amyloid beta (A4) precursor protein (protease nexin-II, Alzheimer disease) 
Ric (Drosophila)-like, expressed in many tissues 
myosin VA (heavy polypeptide 12, myoxin) 
butyrophilin, subfamily 3, member A1 
Pirin 

alkylglycerone phosphate synthase 
spindlin-like 

karyopherin alpha 3 (importin alpha 4) 
aconitase 1, soluble 
mitogen-activated protein kinase 1 
centromere protein E (312kD) 
H3 histone family, member A 
poly(rC)-binding protein 2 

Homo sapiens chromosome X map Xp11.23 L-type calcium channel alpha-1 subunit 
complete sequence; and JM1 protein, JM2 protein, and Hb2E genes, complete cds 
ESTs 

GTPase Rab14 

SUMO-1 activating enzyme subunit 2 
WD repeat domain 4 

SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, 

Homo sapiens cDNA FLJ14041 fis, clone HEMBA1005780 
Homo sapiens cDNA FLJ11968 fis, clone HEMBB1001133 
Homo sapiens cDNA: FLJ23197 fis, clone REC00917 
KIAA0826 protein 
ESTs 

CG1-100 protein 
HSPC030 protein 

ESTs, Weakly similar to T22363 hypothetical protein F47G9.4 - Caenorhabditis elegans 

hypothetical protein MGC10924 similar to Nedd4 WW-binding protein 5 
eukaryotic translation initiation factor 3, subunit 7 (zeta, 66/67kD) 
hepatocellular carcinoma-associated antigen 66 
KIAA16C0 protein 
fumarate hydratase 
similar to yeast Upf3, variant B 

homolog of mouse quaking QKI (KH domain RNA binding protein) 
sperm associated antigen 9 
hypothetical protein MGC10895 
hypothetical protein FLJ10330 
NPD009 protein 

Arg/Abl-interacting protein ArgBP2 
hypothetical protein FLJ12783 

ESTs, Weakly similar to T28770 hypothetical protein W03D2.1 

Homo sapiens cDNA: FLJ23020 fis, clone LNG00943 
SH3 domain binding glutamic acid-rich protein 
ubiquitination factor E4B (homologous to yeast UFD2) 
mesoderm development candidate 2 
DKFZP564M112 protein 
CGI-94 protein 
ADP-ribosylation factor-like 5 

polymerase (RNA) II (DNA directed) polypeptide J (13.3kD) 
NS1-associated protein 1 
DKFZP434A043 protein 
high-mobility group 20A 
104104 


AA422029 


AA422029 


Hs.143640 


[H .sapiens] 






108154 


AA425230 


NM_005754 


Hs.220689 


132091 


AA447052 


AW954243 


Hs.170218 


135073 


AA452000 


W55956 


Hs.94030 


131367 


AA456687 


AI750575 


Hs.173933 


129593 


AA487015_s 


AI338247 


Hs.98314 


135266 


AB002326 


R41179 


Hs.97393 


133505 


C01527 


AI630124 


Hs.324504 


132064 


C01714 


AA121098 




134393 


C01811 f 


W52642 


Hs.8261 


131427 


C02352_s 


AF151879 


Hs.26706 


133435 


C02375 


AI929357 


Hs.323966 


104282 


C14448 


C14448 


Hs.332338 


134827 


D16611_s 


BE314037 


Hs.89866 




D25216 


D25216 


Hs.155650 


131742 


D31352 


AA961420 


Hs.31433 


132837 
130377 


D58024_s 
D80897 


M370362 
NM_014909 


Hs.57958 
Hs.155182 


104334 


D82614 


D82614 


Hs.78771 


134593 


D87845 


NM_000437 


Hs.234392 


134731 


D89377J 


D89377 


Hs.89404 


129913 


H06583 


NM_001310 


Hs.13313 


131670 


H40732 


H03514 


Hs.10130 


104394 


H46617 


M1 29551 


Hs.172129 


104402 


H56731 


H56731 


Hs.132956


129781 


H75570 


AA306090


Hs.124707 


129077 


H78886 


N74724 


Hs.108479 


104417 


H81241 


AI819448 


Hs.320861 


134927 


L36531 


L36531 


Hs.91296 


129280 


M63154 


M63154 


Hs.110014 


134498 


M63180 


AW246273 


Hs.84131 


104460 


M91504 


AW955705 


Hs.62604 


104488 


N56191 


N56191 


Hs.106511 


131248 


N78483 


AI038989 


Hs.332633 


129214 


N79268 


AL044335 


Hs.109526 


130017 


R14652 


AK00G096 


Hs.143198 


104530 


R20459 


AK001676 


Hs.12457 


104534 


R22303 


R22303 




sequence. 






104544 


R33779 


AI091173 


Hs.222362 


133328 


R36553 


AW452738 


Hs.265327 


104567 


R64534 


AA040620 


Hs.5672 


128562 


R66475 


AA923382 


Hs.101490 


129575 


R70621 


F08282 


Hs.278428 


130776 


R79356 


AF167706 


Hs.19280 


104599 


R84933 


AW815036 


Hs.151251 


104660 RC_AA007160 


BE298665 


Hs.14846 


104667 RC_AA007234_s AI239923 


Hs.30098 


104718 RC_AA018409 


AI143020 


Hs.36250 


104764 


RC_AA025351


AI039243 


Hs.278585 


104786 


RC_AA027168


AA027167 


Hs.10031 


104787 


RC_AA027317


AA027317 




similar to contains Alu repetitive element; 


, mRNA sequer 


134079 


RC_AA029423 


AK001751 


Hs.171835 


104804 RC_AA031357 


AI858702 


Hs.31803 


104865 


RC_AA045136


T79340 


Hs.22575 


130828 


RC_AA053400


AW631469 


Hs.203213 


104907 


RC_AA055829


AA055829 


Hs.196701 


WARNING ENTRY [H.sapiens] 




104943 


RC_AA065217 


AF072873 


Hs.114218


105013 


RC_AA116054


H63789 


Hs.296288 


105024 


RC_AA126311


AA126311


Hs.9879 


132592 


RC_AA129390


AW803564 


Hs.288850 


105038 


RC_AA130273


AW503733 


Hs.9414 


105077 


RC_AA142919


W55946 


Hs.234863 


105096 


RC_AA150205


AL04250S 


Hs.21599 


129215 


RC_AA176867


AB040930 


Hs.126085


105169 


RC_AA180321


BE245294 


Hs.180789


132796 


RC_AA180487 


NM_006283 


Hs.173159


130401 


RC_AA187634


BE396283 


Hs.173987 


105200 


RC_AA195399


AA328102 


Hs.24641 


130114 


RC_AA234717


AA233393 


Hs.14992 


105330 


RC_AA234743


AW338625 


Hs.22120 


105337 


RC_AA234957 


AI468789 


Hs.23200 


129385 


RC_AA235604


AA172106 


Hs.110950 



Hs.143640 ESTs, Weakly similar to hyperpolarization-activated cyclic nucleotide-gated channel hHCN2 

Ras-GTPase-activating protein SH3-domain-binding protein 
KIAA0251 protein

Homo sapiens mRNA; cDNA DKFZp586E1624 (from clone DKFZp586E1624)
nuclear factor I/A

Homo sapiens mRNA; cDNA DKFZp586L0120 (from clone DKFZp586L0120) 
KIAA0328 protein 

Homo sapiens mRNA; cDNA DKFZp586J0720 (from clone DKFZp586J0720) 
serum-inducible kinase 
hypothetical protein FLJ22393 
CGI-121 protein 

Homo sapiens clone H63 unknown mRNA 
EST 

coproporphyrinogen oxidase (coproporphyria, harderoporphyria) 
KIAA0014 gene product 
ESTs 

EGF-TM7-latrophilin-related protein 
KIAA103S protein 
phosphoglycerate kinase 1
platelet-activating factor acetylhydrolase 2 (40kD) 
msh (Drosophila) homeo box homolog 2 
cAMP responsive element binding protein-like 2 
ESTs 

Homo sapiens cDNA: FLJ21409 fis, clone COL03924
ESTs 
ESTs 
ESTs 

Kruppel-like factor 8 
integrin, alpha 8 

gastric intrinsic factor (vitamin B synthesis) 
threonyl-tRNA synthetase 

Homo sapiens, clone IMAGE:4299322, mRNA, partial cds 
protocadherin 17 



zinc finger protein 198 

inhibitor of growth family, member 3 

hypothetical protein FLJ10814 

gb:yh26b09.r1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:130841 5', rr



hypothetical protein DKFZp761l141 
hypothetical protein AF140225 
ESTs 

progestin induced protein 
cysteine-rich motor neuron 1 
ESTs 

Homo sapiens mRNA; cDNA DKFZp564D016 (from clone DKFZp564D016) 
ESTs 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens] 



hypothetical protein FU108B9 

ESTs, Weakly similar to N-WASP [H.sapiens]

B-cell CLL/lymphoma 6, member B (zinc finger protein)

ESTs

ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

frizzled (Drosophila) homolog 6 

ESTs, Weakly similar to KIAA0638 protein [H.sapiens] 

ESTs 

Homo sapiens cDNA: FLJ22528 fis, clone HRC12825 
KIAA1488 protein 

Homo sapiens cDNA FLJ12082 fis, clone HEMBB1002492 
Kruppel-like factor 7 (ubiquitous) 
KIAA1497 protein 
S164 protein 

transforming, acidic coiled-coil containing protein 1 
eukaryotic translation initiation factor 3, subunit 1 (alpha, 35kD)



hypothetical protein FLJ11151
131962
131991 
128658 



135172 
131569 
132542 
105643 



105674
105709 
105722 
105765 



RC_AA235559 

RC_AA242868

RC_AA251776

RC_AA251909

RC_AA252672_s

RC_AA255157

RC_AA256680

RC_AA258873

RC_AA262727

RC_AA281451 

RC_AA281545 

RC_AA282069 

RC_AA283044 

RC_AA283930

RC_AA284755 

RC_AA291268 

RC_AA291927 

RC_AA343514

RC_AA398109 

RC_AA405737 

RC_AA406610 



AW994032 Hs.8788 
AA814807 Hs.7395 
AK00004S Hs.267448 
Hs.36708 
Hs.324830 
Hs.24115 
Hs.326416 
Hs.109694 
Hs.12144 
Hs.271623 
Hs.263671 
Hs.173802 
Hs.25625 
Hs.34906 
AI609530 Hs.279789 
AI928962 Hs.26761 



AL137751 
BE621719 
AA283044 
AA426234 



106008 RC_AA411465 

131216 RC_AA416886 

134222 RC_AA424013

113689 RC_AA424148

108141 RC_AA424558

130839 RC_AA424961_s

106157 RC_AA425367

130777 RC_AA425921 
receptor 

130561 RC_AA426220 

106196 RC_AA427735
WARNING 

131878 RC_AA430S73 

133200 RC_AA432248 

106302 RC_AA435896

106328 RC_AA436705 

450534 RC_AA446561 

106423 RC_AA448238 

133442 RC_AA448688 

439608 RC_AA449756

106477 RC_AA450303 

106503 RC_AA452411 

446999 RC_AA454566 

106543 RC_AA454657 

130010 RC_AA456437 

106589 RC_AA456646 

106593 RC_AA456826 

106596 RC_AA456981 
CONTAMINATION 

134655 RC_AA458959 
member 1 

106636 RC_AA459950 

106654 RC_AA460449

131353 RC_AA463910 

106707 RC_AA464603 

131710 RC_AA464606 

106717 RC_AA465093 

131775 RC_AA465692 

106747 RC_AA476473 

106773 RC_AA478109 

106781 RC_AA478474 

106817 RC_AA480889 

106846 RC_AA485223

106848 RC_AA485254 

106856 RC_AA486183 



AI570189 
AB020722 
AL137663 



AA151520 
AA676939 
AA301116 



hypothetical protein FLJ10849
hypothetical protein FLJ23182 
hypothetical protein FLJ20039 

budding uninhibited by benzimidazoles 1 (yeast homolog), beta 

diptheria toxin resistance protein required for diphthamide biosynthesis (Saccharomyces)-like 2

Homo sapiens cDNA FLJ14178 fis, clone NT2RP2003339

Homo sapiens mRNA; cDNA DKFZp564H1916 (from clone DKFZp564H1916) 

KIAA1451 protein 

KIAA1033 protein 

nucleoporin 50kD

Homo sapiens mRNA; cDNA DKFZp434l0812 (from clone DKFZp434l0812); partial cds
KIAA0603 gene product
hypothetical protein FLJ11323

ESTs, Weakly similar to T17210 hypothetical protein DKFZp434N041.1 [H.sapiens] 



Hs.8619 
Hs.243901 
Hs.8025 
Hs.16621 
AF031463 Hs.9302 
Hs.20141 
Hs.34892 
Hs.285418 



Hs.28020 

Hs.25132 

Hs.16714 

Hs.7378 

Hs.301732 

Hs.41693 

Hs.29679 

Hs.334822 

Hs.69285 

Hs.142838

Hs.28661 

Hs.24605 

Hs.293552 



AW075485 
AW754182 
AK000566 
NM_015368
AA600357 



NM_007118

AA478109 

AA330310 

D61216 

AB037744 

AA449014 

W58353 



Hs.98135 
Hs.30985 
Hs.239489 
Hs.31921 
Hs.171957 



Hs.24181
Hs.18672 
Hs.34892 
Hs.121025



418699 RC_AA496936 BE539639 Hs.173030 



WARNING 

107001 RC_AA598589 AI926520 

130638 RC_AA598831_f AW021276 

107054 RC_AA600150 AI076459 

107059 RC_AA608545 BE614410 

107080 RC_AA609210 AL122043 

107115 RC_AA610108 BE379623 

107130 RC_AA620582 AB033106 



Hs.31016 
Hs.17121 
Hs.15978 
Hs.23044 
Hs.19221 
Hs.27693 
Hs.12913 



DKFZP586L0724 protein 

ESTs 

ESTs 

sec13-like protein 
hypothetical protein FLJ10120 

gb:zv15b10.s1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:753691 3' similar to 

SRY (sex determining region Y)-box 18 

Homo sapiens cDNA FLJ20738 fis, clone HEP08257 

Homo sapiens clone 23767 and 23782 mRNA sequences 

DKFZP434I116 protein 

phosducin-like 

similar to S. cerevisiae SSM4
KIAA1323 protein 

Homo sapiens cDNA FLJ10643 fis, clone NT2RP2005753, highly similar to Homo sapiens 1-1



hypothetical protein MGC3178 
hypothetical protein FLJ10210 
hypothetical protein FLJ23221 
KIAA0766 gene product 
KIAA0470 gene product 
Rho guanine exchange factor (GEF) 15 

Homo sapiens mRNA; cDNA DKFZp434G227 (from clone DKFZp434G227) 

hypothetical protein MGC5306 

DnaJ (Hsp40) homolog, subfamily B, member 4 

cofactor required for Sp1 transcriptional activation, subunit 3 (130kD) 

hypothetical protein MGC4485 

neuropilin 1 

nucleolar phosphoprotein Nopp34

Homo sapiens cDNA FLJ10071 fis, clone HEMBA1001702 

ESTs 

ESTs, Moderately similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE 

SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily f, 

ribosomal protein L4 
phosphoserine aminotransferase 

gb:RC2-CT0321-131199-011-c01 CT0321 Homo sapiens cDNA, mRNA sequence 
hypothetical protein FLJ20559 
pannexin 1 

TIA1 cytotoxic granule-associated RNA-binding protein 
KIAA0648 protein 

triple functional domain (PTPRF interacting) 

ESTs 

ESTs 

ESTs 

KIAA1323 protein 

chromosome 11 open reading frame 5

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 2005779 

ESTs, Weakly similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE CONTAMINATION

putative DNA binding protein 
ESTs 

KIAA1272 protein 

RAD51 (S. cerevisiae) homolog (E coli RecA homolog) 
hypothetical protein DKFZp566G1424 
peptidylprolyl isomerase (cyclophilin)-like 1 
KIAA1280 protein 
107156 RC_AA621239
107174 RC_AA621714 
130621 RC_AA621718 
107190 RC_D19673 
132626 RC_D25755_s 
107217 RC_D51095 
131610 RC_D60272_i
129604 T08879 
107295 T34527 
(GalNAc-T1)
107299 T40327_s 

107315 T62771_s 

107316 T63174_s
107328 T83444 
107334 T93641 
134715 U48263 
128636 U49065 
129938 U79300 
107375 U88573
130074 U93867 
107387 W01094 
132036 W01568 
107426 W26853 
113857 W27179 
135388 W27965 
130419 W36280_s 
107469 W47063 
132616 W79060 
107506 W88550 
132358 X60486 
107522 X78931_s 
125827 Z14077_s
107582 RC_AA002147 
107609 RC_AA004711 
107661 RC_AA010383 
107714 RC_AA015761 
107775 RC_AA018772 
107832 RC_AA021473j
sequence. 

107859 RC_AA024835
124337 RC_AA025858
107914 RC_AA027229
[C.elegans]

107935 RC_AA029428
116262 RC_AA035143
131461 RC_AA035237
108007 RC_AA039347
108029 RC_AA040740
108040 RC_AA041551
member 1

108084 RC_AA045513
108088 RC_AA045745 
108168 RC_AA055348 
130719 



AA137043 
BE122762 
AW513087 
AA836401 
AW504732 
AL080235 
AA357879 



BE277457 
AA316241 
T63174 



Hs.25338 
Hs.16803 
Hs.5103 
Hs.21275 
Hs.35861 
Hs.29423 
Hs.11590 
Hs.80120 

Hs.30661 
Hs.90691 
Hs.193700



Hs.135587 
Hs.251064 
Hs.250745 
Hs.118893
AL157433 Hs.37706 
W26853 Hs.291003 
AW243158 Hs.5297 
W27965 Hs.99865 
AF037448 Hs.155489 
W47063 Hs.94668 
BE262677 Hs.283558 
AB028981 Hs.8021 
NM_003542 Hs.46423 
X78931 Hs.99971 
NM_003403
AA002147 
R75654 
AA010383 
AA015761 Hs.60642 
AW008846 Hs.60857 
AA021473 



Hs.97496 
Hs.59952 
Hs.164797 
Hs.60389 



AW732573 

N23541 

AA027229



Hs.47584 
Hs.281561 
Hs.61329 



Hs.61916 
Hs.62007 
Hs.159971 



108190 RC_AA056746 

108203 RC_AA057678 

108216 RC_AA058681 

108217 RC_AA058686 
108245 RC_AA062840 
108277 RC_AA064859 
mRNA 

108280 RC_AA065069 



AA058944 
AA045745 
AI453137 
AA679262 
AW376061 
AA056746 
AW847814 
AA524743 
AA058686 
BE410285 



133739 RC_AA070799_s BE536554 

108340 RC_AA070815 AA069820 

108403 RC_AA075374 AA075374 

3', mRNA sequence. 

108427 RC_AA076382 AA076382 

3', mRNA sequence. 

108435 RC_AA078787 T82427 

108439 RC_AA078986 AA078986 

3', mRNA sequence. 

108465 RC_AA079393 AA079393

108469 RC_AA079487 AA079487 

sequence 



Hs.63335 
Hs.63338 
Hs.289005 
Hs.44883 
Hs.62588 
Hs.89545 



Hs.194101 
Hs.3462 



programmed cell death 6-interacting protein 
ESTs 

LUC7 (S. cerevisiae)-like
ESTs 

hypothetical protein FLJ11011 
DKFZP586E1621 protein 
scavenger receptor with C-type lectin 
cathepsin F 

UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalacto



hypothetical protein MGC4606 
nucleophosmin/nucleoplasmin 3

Homo sapiens mRNA; cDNA DKFZp586l0324 (from clone DKFZp586l0324) 

KIAA0887 protein 

ESTs 

prepronociceptin 

interleukin 1 receptor-like 2

Human clone 23629 mRNA sequence 

high-mobility group (nonhistone chromosomal) protein 14 

polymerase (RNA) III (DNA directed) (62kD)

Melanoma associated gene 

hypothetical protein DKFZp434E2220 

hypothetical protein MGC4707 

DKFZP564A2416 protein 



ESTs 

hypothetical protein PRO1855

KIAA1058 protein 

H4 histone family, member G 

zinc finger protein 272

YY1 transcription factor 

EST 

hypothetical protein FLJ13693

ESTs 

ESTs 

ESTs 

gb:zeS5d 1 .s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:363956 3', mRNA 

potassium voltage-gated channel, delayed-rectifier, subfamily S, member 3 
Homo sapiens cDNA: FLJ23582 fis, clone LNG13759 

ESTs, Weakly similar to T16370 hypothetical protein F45E12.5 - Caenorhabditis elegans



AA029428 Hs.61555 ESTs 



SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily b, 

Homo sapiens, clone IMAGE:4154008, mRNA, partial cds 

ESTs 

ESTs 

hypothetical protein FLJ20008; KIAA1839 protein 
ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]
EST 



gb:zm12e11.s1 Stratagene pancreas (937208) Homo sapiens cDNA clone 3', mRNA sequence
gb:zm67e03.r1 Stratagene neuroepithelium (937231) Homo sapiens cDNA clone 5' similar to
unactive progesterone receptor, 23 kD
peroxiredoxin 1

gb:zm87a01.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone IMAGE:544872



gb:zm91g08.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone IMAGE:545342

Homo sapiens cDNA: FLJ20869 fis, clone ADKA02377

gb:zm92h01.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone IMAGE:545425
AA100796

AI732404 

AI907537 

D14041 

AW067800 

AA127017 

AI652236 



108500 RC_AA083207 

108501 RC_AA083256 
gb:M33308 

108533 RC_AA084415
mRNA

108562 RC_AA085274
gb:X15341

108589 RC_AA088678

130890 RC_AA100925

134585 RC_AA101255

130385 RC_AA126474

108749 RC_AA127017

108807 RC_AA129968

108808 RC_AA130240
108833 RC_AA131866
107290 RC_AA132039
108846 RC_AA132983 
108857 RC_AA133250 
131474 RC_AA133583 
108894 RC_AA135941 
108941 RC_AA148650
IMAGE:567202 3',
108968 RC_AA151110
108996 RC_AA155754
109001 RC_AA156125
131183 RC_AA156289
109019 RC_AA156997

109022 RC_AA157291

109023 RC_AA157293
109068 RC_AA164293.
109072 RC_AA164676
129021 RC_AA167375
130346 RC_AA167550
109146 RC_AA176589
109172 RC_AA180448
131080 RC_AA187144
129208 RC_AA189170
109222 RC_AA192757
109300 RC_AA205650 AA418276 



AA083207 Hs.68270 



Hs.6£ 



AF188527 
W27740 
AL117452
AK001468 
s L45353 
AK001431 
AA148650 



Hs.76698 
Hs.278573 
Hs.155223 
Hs.71052 
Hs.49376 
Hs.62738 
Hs.61661 
Hs.323780 
Hs.44155 
Hs.62180 
Hs.2726 
Hs.5105 



EST 

gb:zn08g12.s1 Stratagene hNT neuron (937233) Homo sapiens cDNA clone 3' similar to
gb:zn06g09.s1 Stratagene hNT neuron (937233) Homo sapiens cDNA clone IMAGE:546688 3',
gb:zm26c06.s1 Stratagene pancreas (937208) Homo sapiens cDNA clone 3' similar to 
ESTs 

stress-associated endoplasmic reticulum protein 1; ribosome associated membrane protein 4 

H-2K binding factor-2 

stanniocalcin 2

ESTs 

hypothetical protein FLJ20644 
ESTs 

ESTs, Weakly similar to AF174605_1 F-box protein Fbx25 [H.sapiens]
ESTs 

DKFZP586G1517 protein 

anillin (Drosophila Scraps homolog), actin binding protein 
high-mobility group (nonhistone chromosomal) protein isoform I-C
hypothetical protein FLJ10569

gb:zo09e06.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone 



AI304870 Hs.188680 ESTs 



AI056548 
AI611807 
AA156755 
AA157291 
AA157293 
f AA164293 
AI732585 
AL044675 
H05769 



AA180448 
s NM_001955
f AI587376 



110107 
110155 

110197
110223 
110306 
110335 
110342 

110395
110511 
110523 
110715 
110754 

130132
131135 
134263 
110938 
110983 

115062
111081 



1 RC_AA233342 
5 RC_AA233472
3 RC_AA234110
7 RC_D80981
5 RC_F01660
7 RC_F02206
3 RC_F02208
5 RC_F02544
5 RC_F03918
3 RC_F04258_s
3 RC_F04600
1 RC_F08998
9 RC_F09605
3 RC_F11115 

3 RC_H0S371 

4 RC_H10995
3 RC_H11938

RC_H16568
RC_H16772
RC_H18951
RC_H20859
RC_H23747
RC_H38087
RC_H40331 
RC_H40567 
RC_H46966 
RC_H56640_i 
RC_H57154 
RC_H96712 



Hs.72116 

Hs.285107 

Hs.72150 

Hs.21479 

Hs.72168 

Hs.72545 

Hs.22394 

Hs.173081 

Hs.188757 

Hs.142078 

Hs.144300 

Hs.2271 

Hs.109441 

Hs.333512 

Hs.170142 



AF119555

H17800 

R59210 

H18013 

AW016809 

R52417 

AL109666 

H11938 

R44557 

AW151660 

AI559626 

AW090386 

H19836 

H38087 

H65490 

H40961 

AA025116 



RC_N 
RC_N25249 
RC_N27100 
RC_N39616 
RC_N48982
RC_N51957 
RC_N52271 
RC_N59435



NM_015367
AA253314 
AI146349 



Hs.71913 

Hs.34898 

Hs.87385 

Hs.296639 

Hs.27214 

Hs.27301 

Hs.22697 

Hs.184011 

Hs.7154 

Hs.26634 

Hs.167483 

Hs.323795 

Hs.20945 

Hs.7242 

Hs.21907 

Hs.23748 

Hs.31444 

Hs.93522 

Hs.112278 

Hs.31697 

Hs.105509 

Hs.18845 

Hs.33008 

Hs.33333 

Hs.221460 

Hs.19102 



Hs.184376 

Hs.267182 

Hs.8086 

Hs.38034 

Hs.10267 

Hs.154103 

Hs.271614 



hypothetical protein FLJ20992 similar to hedgehog-interacting protein 

hypothetical protein FLJ13397

ESTs 

ubinuclein 1 

ESTs 

ESTs 

hypothetical protein FLJ10893
KIAA0530 protein 

Homo sapiens, clone MGC:5564, mRNA, complete cds 



1 

MSTP033 protein 
similar to rat myomegalin 
ESTs 

hypothetical protein FLJ21016 

Homo sapiens cDNA: FLJ21869 fis, clone HEP02442 

ESTs 

ESTs 

ESTs 

Homo sapiens potassium channel subunit (HERG-3) mRNA, complete cds 



ESTs 

pyrophosphatase (inorganic) 

ESTs 

ESTs 

ESTs 

ESTs 

Homo sapiens clone 24993 mRNA sequence 

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35 

histone acetyltransferase

ESTs 

ESTs 

Homo sapiens mRNA for KIAA1647 protein, partial cds 

arrestin, beta 1 

ESTs 

CTL2 gene 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to organic anion transporter 1 [H.sapiens]
ESTs 

KIAA0672 gene product 
synaptosomal-associated protein, 23kD 
TBX3-iso protein 

RNA (guanine-7-) methyltransferase 

Homo sapiens cDNA FLJ12924 fis, clone NT2RP2004709 
111128 RC_N64139
135244 RC_N66981
111216 RC_N68640
437562 RC_N69352
131002 RC_N95226
111399 RC_R00138
111514 RC_R07998 
similar to 

130182 RC_R08929 
111574 RC_R10307 
111804 RC_R33354
111831 RC_R36083 
129675 RC_R37938J 
111904 RC_R39330 
sequence 

133868 RC_R40816_s 
112033 RC_R43162_s 



112300 RC_R54554

112513 RC_R68425 

112514 RC_R68568 



AW505364 
AI834273 
AW139408 
AB001636 
AL050295 
AW270776 



BE267033 
AI024145 
AA482478 



Hs.19074 
Hs.9711 
Hs.152940 



Hs.181785 
Hs.268695 
Hs.172180 



LATS (large tumor suppressor, Drosophila) homolog 2 

novel protein 

ESTs 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15 

KIAA0758 protein 

ESTs 

gb:yf16g11.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:127076 3'

ubiquitin-conjugating enzyme E2G 2 (homologous to yeast UBC7)

ESTs 

ESTs 

ESTs 

KIAA0440 protein 

gb:HSCZYB122 normalized infant brain cDNA Homo sapiens cDNA clone c-zyb12, mRNA 



112540 RC_R70467 

130346 RC_R73565 

129534 RC_R73640

112597 RC_R78376 

112732 RC_R92453 

131458 RC_T03865 

112888 RC_T03872 

131863 RC_T10072 

112911 RC_T10080 

132215 RC_T10132

112931 RC_T15343

112984 RC_T23457

112998 RC_T23555

133376 RC_T23570

113026 RC_T23948

113070 RC_T33464

128970 RC_T34413

113074 RC_T34611

113095 RC_T40920 

113179 RC_T55182 

113337 RC_T77453

113421 RC_T84039

113454 RC_T86458

113481 RC_T87693

131441 RC_T89350_s

113557 RC_T90945

113559 RC_T90987

113589 RC_T918S3

113591 RC_T91881

113619 RC_T93783_s

113683 RC_T96687

113692 RC_T96944

113702 RC_T97307
mRNA 

113717 RC_T97764 

113824 RC_W48817

113840 RC_W58343 

113844 RC_W59949 
PROTEIN TC10 

113902 RC_W74644 

113904 RC_W74761

113905 RC_W74802

113931 RC_W81205

113932 RC_W81237
131965 RC_W90146_f
114035 RC_W92798
11410S RC_Z38412
133593 RC_Z38709
114161 RC_Z38904
424949 RC_Z39103
129059 RC_Z39930_f
128937 RC_Z39939
WARNING 

130983 RC_Z40012_i 



R69751 

H05769 

AK002126 

R78376 

R92453 



AW195317 
AI656378 
AW732747 
AL035703 



AA376654 
AB032977 
AI375672 
AK001335 



T77453 

AI769400 

AI022166 

T87693 

AA302862 

H66470 



AB035335 
AL360143 
T97307 

T99513 
AI631964 
R72137 
AI369275 

AA340111
AF125044
R81733 



Hs.183874 
Hs.22627 
Hs.21893 
Hs.26125 



Hs.188757 

Hs.11260

Hs.29733 

Hs.34590 

Hs.27047 

Hs.107716 

Hs.33461 

Hs.13493 

Hs.4236 

Hs.167428 

Hs.289014 

Hs.22968 

Hs.7232 

Hs.183684

Hs.6298 

Hs.165028 

Hs.31137 

Hs.126733

Hs.152571 

Hs.302234 

Hs.189729 

Hs.16188 

Hs.204327 

Hs.90063 

Hs.16004 

Hs.14514 

Hs.15682 

Hs.200597 

Hs.17244 

Hs.144519 



Hs.243010 

Hs.100009 

Hs.19196 

Hs.33106 

Hs.3496 

Hs.126485 



AF052212 
AW069534 
AA251380 

AI479813 



Hs.299883 
Hs.153934 
Hs.279583 
Hs.10726 



hypothetical protein DKFZp761N0624



gb:yi40a10.s1 Soares placenta Nb2HP Homo sapiens cDNA clone 3', mRNA se 

Homo sapiens, clone MGC:5564, mRNA, complete cds 

hypothetical protein FLJ11264

EST 

ESTs 

hypothetical protein FLJ20392 
hypothetical protein FLJ22344 
ESTs 

like mouse brain protein E46 
KIAA0478 gene product 
ESTs 

ESTs, Weakly similar to A43932 mucin 2 precursor, intestinal [H.sapiens] 

Homo sapiens clone IMAGE:451939, mRNA sequence 

acetyl-Coenzyme A carboxylase alpha 

eukaryotic translation initiation factor 4 gamma, 2 

KIAA1151 protein 

ESTs 

protein tyrosine phosphatase, receptor type, E 
ESTs 

ESTs, Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens] 

ESTs 

ESTs 



ESTs 

KIAA0563 gene product 
hypothetical protein FLJ13605
T-cell leukemia/lymphoma 6 
DKFZP434H132 protein 

gb:ye53h05.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:121497 3', 



Hs.269181 ESTs 



Homo sapiens cDNA FLJ14445 fis, clone HEMBB1001294, highly similar to GTP-BINDING

acyl-Coenzyme A oxidase 1, palmitoyl 
ubiquitin-conjugating enzyme HBUCE1 
ESTs 

hypothetical protein MGC15749 

hypothetical protein FLJ12604; KIAA1692 protein

ESTs 



gb:RC5-BT0562-260100-011-A02 BT0562 Homo sapiens cDNA, mRNA sequence

inositol 1,4,5-triphosphate receptor, type 2
hypothetical protein FLJ23399 

core-binding factor, runt domain, alpha subunit 2; translocated to, 2 
CGI-81 protein 

ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION 



Hs.278411 NCK-associated protein 1 
114277 RC_Z40377_s
[C.elegans] 
114304 RC_Z40820 
114364 RC_Z41680
132900 RC_AA005112
129034 RC_AA005432
131881 RC_AA010163
452461 RC_AA026356
114465 RC_AA026901
131376 RC_AA0368S7
101567 RC_AA044644
431555 RC_AA046426
132944 RC_AA054515 
114618 RC_AA084162 
130274 RC_AA085749 
110330 RC_AA098874 
114648 RC_AA101056 
IMAGE:548429 3' 
114658 RC_AA102746 
132456 RC_AA114250_s
131319 RC_AA126561_s
132225 RC_AA128980_i
IMAGE:567164 3'
132669 RC_AA129757
114709 RC_AA129921
131973 RC_AA133331
114750 RC_AA135958
115714 RC_AA136524_s
114763 RC_AA147044
114767 RC_AA148885
114774 RC_AA150043
129388 RC_AA151621
129183 RC_AA155743
128869 RC_AA156335
130207 RC_AA156336
114798 RC_AA159181
114800 RC_AA159825
[C.elegans]

114828 RC_AA234185
114846 RC_AA234929
114848 RC_AA234935
114902 RC_AA236359
132271 RC_AA236466 



AI052229 Hs.25373 ESTs, Weakly similar to T20410 hypothetical protein E02A10.2 - Caenorhabditis elegans 



AI934204 
AL117427
AA777749 
AA481157 
AW361018 
N78223 
BE621056 
AK001644 
M33552 
AI815470 
T96641 
AW979261 
AA128376 
AI288666 
AA101056 



Hs.16129 

Hs.172778 

Hs.5978 

Hs.108110 

Hs.3383 

Hs.108106 

Hs.131731

Hs.26156 

Hs.56729 



Hs.291993 
Hs.153884 
Hs.16621 



Hs.249190 
Hs.48924 
Hs.25590 



AA397651
AB018284 
AA887211 



AA768242 
AF044209 
AA159181 
Z19448 



135159 RC_AA236935_s 
132204 RC_AA236942 
114928 RC_AA237018
132481 RC_AA237025
114932 RC_AA242751
314162 RC_AA242760
131006 RC_AA242763
114935 RC_AA242809
WARNING

132454 RC_AA243133
437754 RC_AA243495
114957 RC_AA243706
114974 RC_AA250848
114977 RC_AA250868
114995 RC_AA251152
115005 RC_AA251544
417177 RC_AA251792
131889 RC_AA252063
115026 RC_AA252144 
115045 RC_AA252524 
115068 RC_AA253461 
133138 RC_AA255522 
RECEPTOR, 
115114 RC_AA256468 
129584 RC_AA256528 
115137 RC_AA257976 
134312 RC_AA258296 

115166 RC_AA258409 

115167 RC_AA258421 
129807 RC_AA262077 
115239 RC_AA278650
115243 RC_AA278766 



BE614347 

AW275480 

AB030034 

N29390 

U43374 

AA235827 

AA237018 

W93378 

AA971436 



BE296227 

R60366 

AW170425 

AW966931 

AW295978 



AB011151 
AF095727 
AA749209 
Y11192 



ESTs 

Homo sapiens mRNA; cDNA DKFZp566P013 (from clone DKFZp566P013) 

LIM domain only 7 

DKFZP547E2110 protein 

upstream regulatory element binding protein 1 

transcription factor 

hypothetical protein FLJ11099 

hypothetical protein FLJ10782

lysosomal 

Cdc42 effector protein 3 

Homo sapiens cDNA: FLJ23020 fis, clone LNG00943 
ESTs 

ATP binding protein associated with cell differentiation 
DKFZP434I116 protein

gb:zn25b03.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone 

tumor necrosis factor receptor superfamily, member 10a 
KIAA0512 gene product; ALEX2 



AI760825 

NM_004458

NM_002589

AA251972 

AW014549 

AW512260 

AV657594 



Hs.43728 
Hs.5299 
Hs.73291 
Hs.116665



Hs.129467 
Hs.172572 
Hs.88977 
Hs.154443 
Hs.184325 
Hs.110964
Hs.273369 
Hs.80618 
Hs.144904 
Hs.54900 
Hs.131887 

Hs.283522 
Hs.166196 
Hs.169615 
Hs.39504 
Hs.115175 
Hs.13804
Hs.95631 
Hs.42265 
Hs.94869 
Hs.49614 
Hs.16218 
Hs.38516 
Hs.22116 
Hs.290880 

Hs.250822 



gb:zo09a11.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone

guanine nucleotide binding protein (G protein), gamma 3, linked 
proline synthetase co-transcribed (bacterial homolog) 
KIAA0741 gene product 
ESTs 

hypothetical protein FLJ20093 

hypothetical protein dJ511E16.2 

minichromosome maintenance deficient (S. cerevisiae) 4 

CGI-76 protein 

hypothetical protein FLJ23471

uncharacterized hematopoietic stem/progenitor cells protein MDS027 



hypothetical protein 



Homo sapiens mRNA; cDNA DKFZp434J1912 (from clone DKFZp434J1912) 

ATPase, Class I, type 8B, member 1 

hypothetical protein FLJ20989 

hypothetical protein MGC4308 

sterile-alpha motif and leucine zipper containing kinase AZK 

hypothetical protein dJ462023.2 

Human normal keratinocyte mRNA 

ESTs 

ESTs 

ESTs 

KIAA0903 protein 

Homo sapiens, clone MGC:15887, mRNA, complete cds 
CDC14 (cell division cycle 14, S. cerevisiae) homolog B

ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION



in -Hike 1 



ESTs 
ESTs 

fatty-acid-Coenzyme A ligase, long-chain 4 

BH-protocadherin (brain-heart) 

ESTs 

ESTs 

ESTs 

Homo sapiens cDNA FLJ14643 fis, clone NT2RP2001597, weakly similar to RYANODINE 



Hs.7527 

Hs.184325 

Hs.56155 



hypothetical protein MGC14139 
myelin protein zero-like 1 
hypothetical protein 

aldehyde dehydrogenase 5 family, member A1 (succinate-semialdehyde dehydrogenase)
hypothetical protein FLJ10881 
KIAA1842 protein 
100850 RC_AA279667_£ AA836472 Hs.297939 cathepsin B
126884 RC_AA280791 U4943S Hs.286236 KIAA1856 protein
115322 RC_AA280819 L08895 Hs.78995 MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C)
133626 RC_AA280828 AW836130 Hs.75277 hypothetical protein FLJ13910
115372 RC_AA282195 AW014385 Hs.88678 ESTs, Weakly similar to Unknown [H.sapiens]
132825 RC_AA283127_s U82671 Hs.57698 Empirically selected from AFFX single probeset
130269 RC_AA284694 F05422 Hs.168352 nucleoporin-like protein 1
129192 RC_AA291137 AA286914 Hs.183299 ESTs
452598 RC_AA291708 AI831594 Hs.68647 ESTs, Weakly similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE CONTAMINATION WARNING






132131 RC_AA293495 AF069291 Hs.40539 chromosome 8 open reading frame 1
115536 RC_AA347193 AK001468 Hs.62180 anillin (Drosophila Scraps homolog), actin binding protein
132411 RC_AA398474_s AA059412 Hs.47986 hypothetical protein MGC10940
115575 RC_AA398512 AA393254 Hs.43619 ESTs
115601 RC_AA400277 AA148984 Hs.48849 ESTs, Weakly similar to ALU4_HUMAN ALU SUBFAMILY SB2 SEQUENCE CONTAMINATION WARNING








103928 RC_AA400896 D14540 Hs.199160 myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog)
125819 RC_AA404494 AA044840 Hs.251871 CTP synthase
115683 RC_AA410345 AF255910 Hs.54650 junctional adhesion molecule 2
115715 RC_AA416733 BE395161 Hs.1390 proteasome (prosome, macropain) subunit, beta type, 2
132952 RC_AA425154 AI658580 Hs.61426 Homo sapiens mesenchymal stem cell protein DSC96 mRNA, partial cds
115819 RC_AA426573 AA486620 Hs.41135 endomucin-2
132525 RC_AA431418 AW292809 Hs.50727 N-acetylglucosaminidase, alpha- (Sanfilippo disease IIIB)
115895 RC_AA436182 AB033035 Hs.51965 KIAA1209 protein
132333 RC_AA437099 AA192669 Hs.45032 ESTs
115962 RC_AA446585 AI636361 Hs.179520 hypothetical protein MGC10702
115967 RC_AA446887 AI745379 Hs.42911 ESTs


115974 RC_AA447224 BE513442 Hs.238944 hypothetical protein FLJ10631
115985 RC_AA447709 AA447709 Hs.268115 ESTs, Weakly similar to T08599 probable transcription factor CA150 [H.sapiens]
129254 RC_AA453624 AA252468 Hs.1098 DKFZp434J1813 protein
133071 RC_AA455044 BE384932 Hs.64313 ESTs, Weakly similar to AF257182_1 G-protein-coupled receptor 48 [H.sapiens]
116095 RC_AA456045 AA043429 Hs.62618 ESTs
122691 RC_AA460454_; R19768 Hs.172788 ALEX3 protein
116210 RC_AA476494 BE622792 Hs.172788 ALEX3 protein
116213 RC_AA476738 AA292105 Hs.326740 hypothetical protein MGC10947
134585 RC_AA481422 D14041 Hs.278573 H-2K binding factor-2
134790 RC_AA482269 BE002798 Hs.287850 integral membrane protein 1


116265 RC_AA482595 BE297412 Hs.55189 hypothetical protein
129334 RC_AA485084_: AW157022 Hs.4947 hypothetical protein FLJ22584
116274 RC_AA485431_: AI129767 Hs.182874 guanine nucleotide binding protein (G protein) alpha 12
303150 RC_AA489057 AA887146 Hs.8217 stromal antigen 2
129945 RC_AA489638 BE514376 Hs.165998 PAI-1 mRNA-binding protein
116331 RC_AA491000 N41300 Hs.71616 Homo sapiens mRNA; cDNA DKFZp586N1720 (from clone DKFZp586N1720)
116333 RC_AA491250 AF155827 Hs.203963 hypothetical protein FLJ10339
132994 RC_AA505133 AA112748 Hs.279905 clone HQ0310 PRO0310p1
134577 RC_AA598447 BE244323 Hs.85951 exportin, tRNA (nuclear export receptor for tRNAs)
116391 RC_AA599243 T86558 Hs.75113 general transcription factor IIIA
116394 RC_AA599574j NM_006033 Hs.65370 lipase, endothelial


134531 RC_AA600153 


AI742845 


Hs.110713 


DEK oncogene (DNA binding) 


116417 RC_AA609309 


AW499664 


Hs.12484 


Human clone 23826 mRNA sequence 


116429 RC_AA609710 


AF191018 


Hs.279923 


putative nucleotide binding protein, estradiol-induced 


116439 RC AA610068 


AA251594 


Hs.43913 


PIBF1 gene product 


116459 RC AA621399 


R80137 


Hs.302738 


Homo sapiens cDNA: FLJ21425 fls, clone COL04162 


427505 RC_AA621752 


AA361562 


Hs.1 78761 


26S proteasome-associated pad1 homolog 


132699 RC_C21523 


AW449822 


Hs.55200 


ESTs 


116541 RC_D12160 


D12160 


Hs.249212 


polymerase (RNA) III (DNA directed) (155kD) 


132557 RC_D19708 AA114926 Hs.5122
112259 RC_D25801 AA337548 Hs.333402 hypothetical protein MGC12760
116571 RC_D45652 D45652 gb:HUMGS02848 Human adult lung 3' directed MboI cDNA Homo sapiens cDNA 3', mRNA sequence.
129815 RC_D60208_f BE565817 Hs.26498 hypothetical protein FLJ21657
421919 RC_D80504_s AJ224901 Hs.109526 zinc finger protein 198
116643 RC_F03010 AI367044 Hs.153638 myeloid/lymphoid or mixed-lineage leukemia 2
116661 RC_F04247 R61504 gb:yh16a03.s1 Soares infant brain 1NIB Homo sapiens cDNA clone 3' similar to contains Alu



116715 RC_F10966 

116729 RC_F13700 

318709 RC_H05063 

134760 RC_H16758 

116773 RC_H17315_s 

106425 RC_H22556

116780 RC_H22566

131978 RC_H48459_s

116819 RC_H53073

111428 RC_H56559_s 

133175 RC_H57957_s 



NM_000121

AI823410 

H24201 



Hs.170263
Hs.115823
Hs.285280 
Hs.89548 
Hs.169149 
Hs.247423 
Hs.30098 
Hs.36232 
Hs.93698 
Hs.174174 



tumor protein p53-binding protein, 1 

ribonuclease P, 40kD subunit 

Homo sapiens cDNA: FLJ22096 fis, clone HEP16953

erythropoietin receptor

karyopherin alpha 1 (importin alpha 5)

adducin 2 (beta)

ESTs 

KIAA0186 gene product 
116844 RC_H64938_s H64938

116845 RC_H 



I RC_H 



AI573283 
H73110 
N29218 
AC005757 



116925 RC_H73110 
116981 RC_H81783 
131768 RC_H86259 
117031 RC_H88353 H88353 
contains L1 

117034 RC_H88639 U72209
132542 RC_H88675 AL137751
134403 RC_H93708_s AA334551
117280 RC_N22107 M18217
117344 RC_N24048
117422 RC_N27028
117475 RC_N30205
117487 RC_N30621
130207 RC_N33258
117549 RC_N33390
117683 RC_N40180
IMAGE:276387 3' similar to
117710 RC_N45198 N45198
104514 RC_N45979_s AF164622
117791 RC_N48325
117822 RC_N48913
129647 RC_N49394
117895 RC_N50656
[H.sapiens]
131557 RC_N50721
133057 RC_N53143
118103 RC_N55326
118111 RC_N55493
mRNA

118129 RC_N57493
IMAGE:277358 3', mRNA
118278 RC_N62955 N62955 



Hs.38458 
Hs.260603 
Hs.40290 



R19085 

AI355562 

N30205 

N30621 

AF044209 

N33390 

N40180 



Hs.93740 
Hs.44203 
Hs.144904 
Hs.44483 



Hs.93956 EST 



ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens] 
gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens cDNA clone, mRNA sequence 
ESTs 

ESTs, Moderately similar to A47582 B-cell growth factor precursor [H.sapiens] 
ESTs 

hypothetical protein 

gb:yw21a02.s1 Morton Fetal Cochlea Homo sapiens cDNA clone IMAGE:252842 3' similar to
YY1-associated factor 2

Homo sapiens mRNA; cDNA DKFZp434I0812 (from clone DKFZp434I0812); partial cds
sperm specific antigen 2 

Homo sapiens cDNA: FLJ21409 fis, clone COL03924 

Homo sapiens cDNA FLJ13182 fis, clone NT2RP3004070 

ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens] 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens] 

ESTs 

nuclear receptor co-repressor 1 
EST 

gb:yy44d02.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 

ESTs, Highly similar to similar to Cdc14B1 phosphatase [H.sapiens] 
golgin-67 



AB018259 Hs.118140 
AW450348 Hs.93996 



AA317439 
AA465131 
AA401733 



N57493 



AI183838 
N46114 
N66845 



NM_003528 Hs.2178



3', ml 

118336 RC_N63604
132457 RC_N64166

118363 RC_N64168

118364 RC_N64191
118475 RC_N66845
similar to

118491 RC_N67135 
118500 RC_N67295 
101663 RC_N68399 
118584 RC_N68963 
sequence 

421983 RC_N69331
118661 RC_N70777
118684 RC_N71364_s

118689 RC_N71545_s

118690 RC_N71571
118765 RC_N74456
118793 RC_N75594
118817 RC_N79035
118844 RC_N80279
118919 RC_N91797
129558 RC_N92454
132692 RC_N94581
118996 RC_N94746 N94746
119021 RC_N98238 N98238
119039 RC_R02384 AI160570
119063 RC_R16833 R16833 
WARNING 

Y07759 
T02865 
AA214228 
119146 RC_R58863 R58863 
120296 RC_R78248 AW995911
119239 RC_T11483 T11483 



AI252640 

AL137554 

N71313 

AW390601 

N71571 

N74456 

N75594 

AI668658 

AL035364 

AW452696 



Hs.184544 
Hs.269142 
Hs.50499 



Hs.50797 
Hs.50891 
Hs.130760 
Hs.180446 



119111 RC_R43203 T02865 



AI692322 

NM_001241 

T10077 



Hs.65373 
Hs.155478 
Hs.13453 
Hs.94030 



ESTs 

KIAA0716 gene product 

ESTs, Highly similar to SORL_HUMAN SORTILIN-RELATED RECEPTOR PRECURSOR 

signal sequence receptor, gamma (translocon-associated protein gamma) 

Homo sapiens clone 25218 mRNA sequence 

ESTs 

gb:yv50c02.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:246146 3',

gb:yy54c08.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone

Homo sapiens cDNA FLJ11375 fis, clone HEMBA1000411, weakly similar to ANKYRIN
gb:yy52f01.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:278137 

HT021 

frizzled (Drosophila) homolog 7 
hypothetical protein FLJ21802
hypothetical protein FLJ22623 

gb:za46d 1 .s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:295604 3'

Homo sapiens cDNA: FLJ23285 fis, clone HEP09071 
ESTs 

H2B histone family, member Q 

gb:UI-H-BJ1-adp-d-08-0-UI.s1 NCI_CGAP_Sub3 Homo sapiens cDNA clone 3', mRNA

peptidylprolyl isomerase C (cyclophilin C) 
protein kinase NYD-SP15 

Homo sapiens cDNA: FLJ22765 fis, clone KAIA1180
Homo sapiens, clone IMAGE:3355383, mRNA, partial cds 
ESTs 
EST 

ESTs, Moderately similar to T47135 hypothetical protein DKFZp761L0812.1 [H.sapiens]
ESTs 

hypothetical protein 

myosin phosphatase, target subunit 2 

karyopherin (importin) beta 1

collagen, type VIII, alpha 2 

hypothetical protein FLJ20758 

ESTs 

pregnancy specific beta-1-glycoprotein 6

ESTs, Moderately similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

myosin VA (heavy polypeptide 12, myoxin) 
EST 

hypothetical protein 



ESTs, Weakly similar to T02345 hypothetical protein KIAA0324 [H.sapiens] 
cyclin T2 

hypothetical protein FLJ14753 

Homo sapiens mRNA; cDNA DKFZp586E1624 (from clone DKFZp586E1624) 
119558 RC_W38194 W38194

132736 RC_W42414_s AW081883 
sapiens mad protein 

132173 RC_W46577_s X89426 

134873 RC_W49632_s AA884471 

119650 RC_W57613 R82342 

119654 RC_W57759 W57759 



Hs.41716 
Hs.90449 
Hs.79856 



AA041350 
W69216 
AI287518 
AW014862 



AW082866 

Z40182 

Z40904 

AW959615 

AA167500

W90403 

AW014786 



119683 RC_W61118 

119694 RC_W65344 

119718 RC_W69216 

133010 RC_W69379

119938 RC_W86728 

120128 RC_Z38499 

120130 RC_Z38630 

120148 RC_Z39494 

120155 RC_Z39623 

131486 RC_Z40071_s 

120183 RC_Z40174 

120184 RC_Z40182 
120211 RC_Z40904 
120245 RC_AA166965 
120247 RC_AA167500 
120254 RC_AA169599_i 

120259 RC_AA171724

120260 RC_AA171739
120275 RC_AA177105
120284 RC_AA182626



114056 RC_AA186324 AA188175 
129507 RC_AA192099 AJ236885 

120302 RC_AA192173 AA837098 

120303 RC_AA192415 AI216292 
120305 RC_AA192553 AW295096 
120319 RC_AA194851 T57776
133389 RC_AA195520_s AA195764 

120326 RC_AA196300 AA196300 
134272 RC_AA196517 X76040 
133145 RC_AA196549 H94227 

120327 RC_AA196721
105686 RC_AA196729_i N66397

120328 RC_AA196979 AA923278
120340 RC_AA206828 AA206828 
similar to 

134292 RC_AA207123 AI906291 
131522 RC_AA214539_i AI380040 
129051 RC_AA226914_s AA227068 

120375 RC_AA227260 AF028706 

120376 RC_AA227469 AA227469
IMAGE:663732 3', mRNA sequence. 
120390 RC_AA233122 AA837093 
303876 RC_AA233334_s U64820 
dominant, ataxin 3) 

132038 RC_AA233347 AI825842 
104463 RC_AA233519 T85825
125750 RC_AA233714 AA018515
120396 RC_AA233796 AA134006
120409 RC_AA235050J AA235050 
gb:L07077 

120414 RC_AA235704 AW137156 
120420 RC_AA236031 AI128114 

120422 RC_AA236352 AL133097 
132221 RC_AA236390_s W94915 

120423 RC_AA236453 AA236453 
120435 RC_AA243370 AA243370 
120453 RC_AA250947 AA250947 

120455 RC_AA251083 AA251720 

120456 RC_AA251113 AA488750 
120473 RC_AA251973 AA251973 
128922 RC_AA252023 AI244901 
120477 RC_AA252414 AA252414 
120479 RC_AA252650 
120488 RC_AA255523 
120510 RC_AA258128 

120527 RC_AA262105 

120528 RC_AA262107 



Hs.57835 
Hs.57847 
Hs.92848 
Hs.62669 
Hs.58885 
Hs.91448 
AA045767 Hs.5300 
Hs.65765 
Hs.65783 
Hs.27372 



Hs.65885 
Hs.66012 
Hs.111045 
Hs.103939
Hs.111054
Hs.192742
Hs.101590 
Hs.78457 




AK000292 Hs.278732 



Hs.81234 
Hs.239489 
Hs.108301 
Hs.111227



Hs.3776 
Hs.246885 
Hs.264482 
Hs.79306 



endothelial cell-specific molecule 1 
Human clone 23908 mRNA sequence 

ESTs, Weakly similar to S65657 alpha-1C-adrenergic receptor splice form 2 [H.sapiens] 
gb:zd20g11.s1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:341252 3'

ESTs 

ESTs, Moderately similar to ICE4_HUMAN CASPASE-4 PRECURSOR [H.sapiens]
ESTs 

Homo sapiens mRNA; cDNA DKFZp586D0923 (from clone DKFZp586D0923) 
ESTs 

MKP-1 like protein tyrosine phosphatase 
bladder cancer associated protein 
ESTs 
ESTs 

BMX non-receptor tyrosine kinase 

ESTs 

EST 

EST 

ESTs 

EST 

ESTs 

hypothetical protein FLJ12785
hypothetical protein 

solute carrier family 25 (mitochondrial carrier; ornithine transporter) member 15 

gb:zp54e11.s1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone 3' similar

KIAA1254 protein 
zinc finger protein 148 (pHZ-52) 
ESTs 
ESTs 

uncoupling protein 3 (mitochondrial, proton carrier) 



hypothetical protein RG083M05.2 
protease, serine, 15 
Homo sapiens, clone IMAGE:2961368, mRNA, partial cds 
hypothetical protein FLJ20285 
Homo sapiens cDNA FLJ14752 fis, clone NT2RP3003071
ESTs, Weakly similar to protease [H.sapiens] 

gb:zq80b08.s1 Stratagene hNT neuron (937233) Homo sapiens cDNA clone IMAGE:647895 3'

immunoglobulin superfamily, member 3 

TIA1 cytotoxic granule-associated RNA-binding protein 

nuclear receptor subfamily 2, group C, member 1 

Zic family member 3 (odd-paired Drosophila homolog, heterotaxy 1) 

gb:zr18a07.s1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone 

calcium/calmodulin-dependent protein kinase (CaM kinase) II delta 

Machado-Joseph disease (spinocerebellar ataxia 3, olivopontocerebellar ataxia 3, autosomal 

zinc finger protein 216 
hypothetical protein FLJ20783 

Homo sapiens mRNA; cDNA DKFZp761A0411 (from clone DKFZp761A0411) 
eukaryotic translation initiation factor 4E 

gb:zs38e04.s1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:687486 3' similar to 



Hs.112885 
Hs.301717 
Hs.42419 
Hs.18978 
Hs.96450 
Hs.170263 
Hs.104347 
Hs.88414 



AW952916 
AI796395 
AA262105 
AI923511 



Hs.43141 

Hs.110299 

Hs.63510 

Hs.111377 

Hs.4094 

Hs.104413 



hypothetical protein FLJ10038
spinal cord-derived growth factor-B 
hypothetical protein DKFZp434N1928 
ESTs 

Homo sapiens cDNA: FLJ22822 fis, clone KAIA3968 
EST 

tumor protein p53-binding protein, 1 

ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!! [H.sapiens]

BTB and CNC homology 1, basic leucine zipper transcription factor 2 

ESTs 

ubiquilin 1

DKFZP727C091 protein 
mitogen-activated protein kinase kinase 7 
KIAA0141 gene product 
ESTs 

Homo sapiens cDNA FLJ14208 fis, clone NT2RP3003264 
ESTs 
120529 RC_AA262235
120541 RC_AA278298 
131445 RC_AA278529j 
120544 RC_AA278721 
120562 RC_AA280036 
120569 RC_AA280648 

120571 RC_AA280738

120572 RC_AA280794 
129434 RC_AA280837 
130529 RC_AA280886 
repetitive 

120575 RC_AA280934 
132635 RC_AA281535 
120591 RC_AA281797_s
120593 RC_AA282047
430275 RC_AA283002
117729 RC_AA283706
120609 RC_AA283902
132754 RC_AA284108
130315 RC_AA284109
132614 RC_AA284371
447503 RC_AA284744_f
cds 

135376 RC_AA284784 
120621 RC_AA284840 
107868 RC_AA286844 
129868 RC_AA287032 
120644 RC_AA287038 

120660 RC_AA287546 
135370 RC_AA287553_s 

120661 RC_AA287556 
129116 RC_AA287564 
131567 RC_AA291015_s 
120699 RC_AA291716
100690 RC_AA291749_s
120726 RC_AA293656
120737 RC_AA302430 
120745 RC_AA302809 
135192 RC_AA302820_s 
120750 RC_AA310499 
120761 RC_AA321890 

120768 RC_AA340589 

120769 RC_AA340622 
135232 RC_AA342457j 
CONTAMINATION 
133439 RC_AA342828_s 
120793 RC_AA342864 
120796 RC_AA342973
120809 RC_AA346495 
repeat, mRNA sequence. 
132459 RC_AA347573 
120825 RC_AA347614 
120827 RC_AA347717 
120839 RC_AA348913 
repeat, mRNA sequence, 
120850 RC_AA349647 
120852 RC_AA349773 
128852 RC_AA350541_s 
135240 RC_AA357159_i 
120870 RC_AA357172_i 
WARNING 

134637 RC_AA369856_s 
120894 RC_AA370132 
131854 RC_AA370472_s 
120897 RC_AA370867 
120915 RC_AA377296 

120935 RC_AA383902 
WARNING 

120936 RC_AA385934 

120937 RC_AA386255

120938 RC_AA386260 
129722 RC_AA386266 
120960 RC_AA398014 
120985 RC_AA398222 
120988 RC_AA398235



AI434823 
W07318 
NM_014264 
BE548277 



AF078847 
AA748355 
Z11773 



AW978721 
AI752244 
AI241084 
AA284371 
AA115496

BE617856 
AW961294 
AA286844 
AW172431
AI869129 



AA287558 
AB019494 
AF015592 
AI683243 
AA383256 



AL049176 



Z23091 
AA342864 
AI247356 
AA346495 



Hs.104415 ESTs

Hs.240 M-phase phosphoprotein 1 

Hs.172052 serine/threonine kinase 18 

Hs.103104 ESTs 

Hs.302267 hypothetical protein FLJ10330 

Hs.24970 ESTs, Weakly similar to B34323 GTP-binding protein Rab2 [H.sapiens] 

Hs.34892 KIAA1323 protein

Hs.294008 ESTs

Hs.186644 ESTs

gb:zp39e03.s1 Stratagene muscle 937209 Homo sapiens cDNA clone 3' similar to contains Alu

Hs.238911 hypothetical protein DKFZp762E1511; KIAA1816 protein

Hs.191356 general transcription factor IIH, polypeptide 2 (44kD subunit)

Hs.193522 ESTs

Hs.237786 zinc finger protein 187 

Hs.7145 calpain 7 

Hs.266076 ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens] 

Hs.75309 eukaryotic translation elongation factor 2 

Hs.154353 nonselective sodium potassium/proton exchanger 

Hs.118064 similar to rat nuclear ubiquitous casein kinase 2

Hs.336898 Homo sapiens, Similar to RIKEN cDNA 1810038N03 gene, clone MGC:9890, mRNA, complete 

Hs.99756 mitochondrial ribosome recycling factor 

Hs.143818 hypothetical protein FLJ23459 

Hs.61260 hypothetical protein FLJ13164 

Hs.13012 ESTs 

Hs.96616 ESTs 

Hs.99677 ESTs 

Hs.99670 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

Hs.263412 ESTs, Weakly similar to ALUB_HUMAN !!!! ALU CLASS B WARNING ENTRY !!! [H.sapiens]

Hs.225767 IDN3 protein

Hs.28853 CDC7 (cell division cycle 7, S. cerevisiae, homolog)-like 1

Hs.97258 ESTs, Moderately similar to S29539 ribosomal protein L13a, cytosolic [H.sapiens]

Hs.1657 estrogen receptor 1 

Hs.97293 ESTs 

Hs.82223 chordin-like 

gb:EST10426 Adipose tissue, white I Homo sapiens cDNA 3' end, mRNA sequence. 

Hs.321709 purinergic receptor P2X, ligand-gated ion channel, 4 

Hs.96693 ESTs, Moderately similar to 2109260A B cell growth factor [H.sapiens]

Hs.1265 branched chain keto acid dehydrogenase E1, beta polypeptide (maple syrup urine disease)

Hs.104560 EST

Hs.96769 ESTs

Hs.96800 ESTs, Moderately similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE

glycoprotein V (platelet) 
ESTs 
ESTs 

gb:EST52657 Fetal heart II Homo sapiens cDNA 3' end similar to EST containing O family 



AL120071 Hs.48998 fibronectin leucine rich transmembrane protein 2

AI280215 Hs.96885 ESTs

AA382525 Hs.132967 Human EST clone 122887 mariner transposon Hsmar1 sequence

AA348913 gb:EST55442 Infant adrenal gland II Homo sapiens cDNA 3' end similar to EST containing Alu

AA349647 Hs 
AA349773 Hs 

R40622 Hs.106601 ESTs

AA357159 Hs.96986 EST

AA357172 Hs.292581 ESTs, Moderately similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

11 vacuolar protein sorting 41 (yeast homolog) 

3 ESTs 

AF229839 Hs.173202 I-kappa-B-interacting Ras-like protein 1

3 ESTs, Moderately similar to AF174605 1 F-box protein Fbx25 [H.sapiens] 

1 ESTs 

7 ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION 



AA370867 
AL135556 
AL048409 

AA385934 

AA386255 

AA386260 

R20855 

AA398014 

AI219896 

AA398235 



Hs.97184 

Hs.97186 

Hs.104632 

Hs.5422 

Hs.104684 

Hs.97592 

Hs.97631 



EST, Highly similar to (defline not available 7499603) [C.elegans] 
121008 RC_AA398348
GSSs and a CpG 
121029 RC_AA398482

121032 RC_AA398504 

121033 RC_AA398505 

121034 RC_AA398507

121035 RC_AA398523 
121058 RC_AA398625 

121060 RC_AA398632 

121061 RC_AA398633 

121091 RC_AA398894
CONTAMINATION

121092 RC_AA398895
121094 RC_AA398900 
121096 RC_AA398904 
121115 RC_AA399122 

121121 RC_AA399371 

121122 RC_AA399373
121125 RC_AA399441
121151 RC_AA399636
121153 RC_AA399640
121163 RC_AA399680
121176 RC_AA400080
121192 RC_AA400262
121223 RC_AA400725 
121227 RC_AA400748 
121231 RC_AA400780 

121278 RC_AA401631 

121279 RC_AA401688 
121282 RC_AA401695 
121299 RC_AA402227

121301 RC_AA402329 

121302 RC_AA402398 

121304 RC_AA402449 

121305 RC_AA402468 
134721 RC_AA403268_s 

121323 RC_AA403314 

121324 RC_AA404229
129047 RC_AA404260
131074 RC_AA404271
121344 RC_AA405026
121348 RC_AA405182
121350 RC_AA405237 
contains Alu 

121400 RC_AA406061 

121402 RC_AA406063 

121403 RC_AA406070 
121408 RC_AA406137 
121431 RC_AA406335 
132936 RC_AA411801 
121471 RC_AA411804
121474 RC_AA411833 
121526 RC_AA412219 
121530 RC_AA412259 

121558 RC_AA412497 
contains L1.t3L1 

121559 RC_AA412498 
121584 RC_AA416586 
121609 RC_AA416867 
121612 RC_AA416874 
121737 RC_AA421133 
121740 RC_AA421138 
129194 RC_AA422079
121784 RC_AA423837

121802 RC_AA424328 

121803 RC_AA424339 
135286 RC_AA424469_i 
121806 RC_AA424502
129517 RC_AA425004 
121845 RC_AA425734 
CONTAMINATION 
121853 RC_AA425887 
121891 RC_AA426456
121895 RC_AA427396 
similar to contains 
121899 RC_AA427555 



Hs.301720 Human DNA sequence from clone RP11-2i 



13 Contains ESTs, STSs, 



AA398482 Hs.97641 
AA393037 Hs.161798 
AA398505 Hs.97360 
AL389951 Hs.271623 
AA398523 Hs.210579 
Hs.97391 



AA398895 Hs.97658 



AA398187 
AA399371 
AI126713 
AL042981 



AA399640 

AI676062 

AL121523 

AA400262 

AI002110 

AA400748 

AA814948 

AA037121 

AA292873 

AA401695 

AA402227 

NM_006202 

AA402587 



Hs.192233
Hs.251278 
Hs.143629 



AA402468 

AK000112 

AA291411 

AA404229 

AI768623 

U16125 

AA405026 

AA405182 

AA405237 

AA406061 
AA406063 
AA406070 
AA406137 
AA035279 
AL120659 
AA411804 
AA402335 
AW665325 
AA778658 
AA412497 

AI192044 

AI024471 

AA416867 

AA416874 

AA421133 

AA421138 

AA150797

T90789 

AI251870 

AI338371 

AW023482 

AA424313 

AW972853 



Hs.325520 
Hs.97316 
Hs.291557 
Hs.89306 
Hs.97247 
Hs.97842 
Hs.108264 
Hs.181581 
Hs.193754 
Hs.97973 



ESTs 
ESTs 

nucleoporin 50kD 



ESTs, Moderately similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE



gb:zt62h10.r1 Soares_testis_NHT Homo sapiens cDNA clone 5', mRNA sequence
ESTs 

ESTs, Weakly similar to mitochondrial citrate transport protein [H.sapiens] 
similar to SALL1 (sal (Drosophila)-like 

ESTs, Highly similar to T00337 hypothetical protein KIAA0568 [H.sapiens] 

KIAA1201 protein 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to dJ667H12.2.1 [H.sapiens] 

Homo sapiens mRNA; cDNA DKFZp434D024 (from clone DKFZp434D024) 

ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!! [H.sapiens]

Homo sapiens cDNA FLJ11490 fis, clone HEMBA1001918 

ESTs 

ESTs 

tropomodulin 3 (ubiquitous) 

phosphodiesterase 4A, cAMP-specific (dunce (Drosophila)-homolog phosphodiesterase E2)
LAT1-3TM protein 



gb:zt06e10.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:712362 3' similar to



Hs.98004 EST 



Hs.98019 EST 



ESTs 

aryl-hydrocarbon receptor nuclear translocator 2 
ESTs 

ESTs, Highly similar to Trad [H.sapiens] 

ESTs 

ESTs 

gb:zt95g12.s1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:730150 3' similar to



Hs.104778 ESTs 
Hs.98232 ESTs 
Hs.98185 EST 



AA425887 
AA426456 
AA427396 



Hs.98334 EST 

Hs.109276 latexin protein 

Hs.94308 RAB35, member RAS oncogene family 

Hs.188898 ESTs

Hs.157173 ESTs 

Hs.97849 ESTs 

Hs.98402 ESTs 

Hs.112237 ESTs

Hs.165066 ESTs, Moderately similar to ALU2_HUMAN ALU SUBFAMILY SB SEQUENCE

Hs.98502 hypothetical protein FLJ14303

Hs.98469 ESTs 

gb:zw33a02.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:771050 3' 

Hs.50421 KIAA0203 gene product
121917 RC_AA428218

121918 RC_AA428242 

121919 RC_AA428281

121941 RC_AA428865

121942 RC_AA428994
121970 RC_AA429666
121993 RC_AA430181
134660 RC_AA430184_s 
126753 RC_AA431288_s 
122022 RC_AA431293 

122050 RC_AA431478

122051 RC_AA431492
122055 RC_AA431732
122105 RC_AA432278 
122125 RC_AA434411 
135235 RC_AA435512_i 
122162 RC_AA435698
129406 RC_AA435711
318801 RC_AA435815_s
122185 RC_AA435842
122235 RC_AA436475
129131 RC_AA436489
134664 RC_AA442060 
122310 RC_AA442079 
122334 RC_AA443151 
122382 RC_AA446133 
122425 RC_AA447145 
122431 RC_AA447398 
122450 RC_AA447643 
302653 RC_AA447742_s 
122477 RC_AA448226 
122500 RC_AA448825 
122522 RC_AA449444 
122536 RC_AA450087
122538 RC_AA450211
122540 RC_AA450244
122560 RC_AA452123
421919 RC_AA452155 
122562 RC_AA452156 
mRNA 

122585 RC_AA453036
122608 RC_AA453526

122635 RC_AA454085 
similar to 

122636 RC_AA454103
122653 RC_AA454642 
122660 RC_AA454935 
122703 RC_AA455323 
122724 RC_AA457395 
122749 RC_AA458850 
122772 RC_AA459662 
131098 RC_AA459668 
129045 RC_AA459679_s 
122777 RC_AA459702 
135362 RC_AA460017J 
122798 RC_AA460324 
122837 RC_AA461509 

122860 RC_AA464414_i
mRNA sequence. 

122861 RC_AA464428 
122910 RC_AA470084
132899 RC_AA476606_s
122967 RC_AA478521
129560 RC_AA478523
123009 RC_AA479949
128917 RC_AA481252
123081 RC_AA485351 
123133 RC_AA487264 
123184 RC_AA489072 
129671 RC_AA489630

123233 RC_AA490225 
[H.sapiens] 

123234 RC_AA490227 
123236 RC_AA490255 
123255 RC_AA490890
129503 RC_AA490916_s 



AA406397 
BE274689 
AA428281 
AA428865 

AW452701 Hs.293237 

AA429666 Hs.98517 

AW297880 Hs.98661 



Hs.98560 EST 
Hs.98563 ESTs 



AK000492 
AW298244 
AA628233 



AA436475 
AB026436 
AA256106 



AA447398 
AA447643 
AJ404468 
AA448226 
AA448825 
AA299607 
AF060877 
AA450211 
AA476741 



AI816827 
AA456323 
AA457395 



Hs.293507 
Hs.79946 
Hs.111138 
Hs.77965 
Hs.104673 
Hs.112227
Hs.177534 
Hs.87507 
Hs.98974 



AA446440 Hs.98643 



Hs.99104 
Hs.112095
Hs.284259 
Hs.324123 



Hs.99236 
Hs.99239 
Hs.98279 



AJ224901 Hs.109526 



ESTs 

ATP/GTP-binding protein 
CD3D antigen, delta polypeptide (TiT3 complex) 

ESTs, Moderately similar to T42650 hypothetical protein DKFZp434D0215.1 [H.sapiens] 
ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 
EST 
EST 
ESTs 

hypothetical protein 
ESTs 

cytochrome P450, subfamily XIX (aromatization of androgens) 
KIAA0712 gene product
peptidyl-prolyl isomerase G (cyclophilin G) 
ESTs 

membrane-associated nucleic acid binding protein 
dual specificity phosphatase 10 
ESTs 

ESTs, Weakly similar to S65824 reverse transcriptase homolog [H.sapiens] 
ESTs, Weakly similar to LB4D_HUMAN NADP-DEPENDENT LEUKOTRIENE B4 12- 
ESTs 

KIAA0399 protein 
ESTs 

hypothetical protein DKFZp434F1819
dynein, axonemal, heavy polypeptide 9 
ESTs 
ESTs 
ESTs 

regulator of G-protein signalling 20 
ESTs 

ESTs, Weakly similar to A43932 mucin 2 precursor, intestinal [H.sapiens]
centrosomal P4.1-associated protein; uncharacterized bone marrow protein BM032
zinc finger protein 198

gb:zx29c03.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:787876 3',

hypothetical protein FLJ23251 
ESTs 

gb:zx33a08.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:788246 3' 

hypothetical protein FLJ14007
ESTs 

nuclear respiratory factor 1 
ESTs 
ESTs 

ESTs, Weakly similar to B34087 hypothetical protein [H.sapiens] 
ESTs 

3-hydroxyisobutyryl-Coenzyme A hydrolase 
hypothetical protein FLJ13409; KIAA1711 protein
hypothetical protein FLJ10160 similar to insulin related protein 2
ESTs, Weakly similar to T17454 diaphanous-related formin - mouse [M.musculus]
splicing factor (CC1.3)

ESTs, Weakly similar to putative p150 [H.sapiens] 

gb:zx78g01.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:809904 3',

ESTs 
ESTs 

SMAD in the antisense orientation 
glucose regulated protein, 53kD 
AA317841 Hs.7845 hypothetical protein MGC2752 
AA535244 Hs.78305 RAB2, member RAS oncogene family 
AI365215 Hs.206097 oncogene TC21

Homo sapiens cDNA FLJ20738 fis, clone HEP08257
Homo sapiens mRNA; cDNA DKFZp667N064 (from clone DKFZp667N064)
KIAA0870 protein
KIAA0665 gene product 

ESTs, Weakly similar to MAPB_HUMAN MICROTUBULE-ASSOCIATED PROTEIN 1B 



AI681654 
AA453525 
AA454085 

AW651706 



Hs.236642 
Hs.30732 
Hs.214397 
Hs.99513 



AA335721 
AA470034 
AA476606 



Hs.154974
Hs.18166
Hs.119004
Hs.188751 



NM_001938 
AW968504 
AA830335 



Hs.16697 down-regulator of transcription 1, TBP-binding (negative cofactor 2)

Hs.123073 CDC2-related protein kinase 7 

Hs.105273 ESTs 

Hs.112157 ESTs
131043 RC_AA490925 AF084535

123259 RC_AA490955 AI744152 
[H.sapiens]

123284 RC_AA495812 AA488988 

123286 RC_AA495824 AA495824 

123315 RC_AA496369 AA496369



129179 RC_AA504125_s 

131612 RC_AA521473

123421 RC_AA598440 

123449 RC_AA598899_i 

129021 RC_AA599244

132830 RC_AA599694_s 

123497 RC_AA600037

123604 RC_AA609135 



123712 RC_AA609684 

123731 RC_AA609839
similar to 

130725 RC_AA609862 

123800 RC_AA620423 

123841 RC_AA620747 

123929 RC_AA621364 

123978 RC_C20653 

133184 RC_D20085 

132835 RC_D20749

132406 RC_D51285_s

128695 RC_D59972_i 

124028 RC_F04112_f 
sequence. 

124057 RC_F13604 

134899 RC_H01662 

130973 RC_H05135_i

124106 RC_H12245 

124136 RC_H22842 

124165 RC_H30894 

131229 RC_H43442_s 

124178 RC_H45996

129948 RC_H69281J

134374 RC_H69485_f 

124254 RC_H69899 



RC_H70627_s 

RC_H73050_s 

RC_H73260 

RC_H77531_s 

RC_H80552 

RC_H80737_s 

RC_H93412 

RC_H94892_s 

RC_H95643_s 

RC_H96552 

RC_H97146 

RC_H99131_s

RC_H99462_s 

RC_H99837_s 

RC_N22140 

RC_N22197

RC_N23756_s 

RC_N24134 

RC_N24195 

RC_N26739 

RC_N27098 

RC_N27637 

RC_N33090 



100919 
130724 
100716 
124274 



124315 
100747 
124324 
452933 
132231 
129170 
133143 
132963 
135297 
134347 
130365 
421642 



132338
131403 
124466 
132210 
124483 

124484
124485 



AU076668 
AA598440 
AL049325 
AL044675 
NM_014777
AA765256 
AA609135 
T47614 



AI638418 

H12245 

H22842 

H30039 

NM_015340

BE463721 

AI537162 



1 RC_N 
7 RC_N38959 f 
3 RC_N39069 
1 RC_N46441 
3 RC_N48270_f 
RC_N48365_s 
RC_N51316 
RC_N51499_s 
RC_N53976 
RC_N54157 
RC_N54300 



N27098 
N27637 
AI193519 
AI364933 



Hs.109154 
Hs.334884 
Hs.291154 
Hs.112493
Hs.173081 
Hs.57730 
Hs.135191 



AA620423 

AA620747 

AA621364

T89832 

AA001021 

Z83844 

AL133731

NM_003478 

F04112 



AI351010 

AW952124 

NM_005402 

X04588 

H96552 

AW391423 

AA662910 

AW250380 

AA094538 



AA280319 
AW450481 
AA353868 
AI473114 
R10084 
NM_007203 
AI821780 
H66118 
AB040933 Hs.15420 



Hs.226396 
Hs.168913 
Hs.6456 
Hs.288840 
Hs.161333 
Hs.182982
Hs.26455 
Hs.113319 
Hs.42322 
Hs.179864 



ESTs 

ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens] 

gb:zv37d10.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:755827 3' similar



1 

EST, Weakly similar to I38022 hypothetical protein [H.sapiens]

Homo sapiens mRNA; cDNA DKFZp564D036 (from clone DKFZp564D036)

KIAA0530 protein

KIAA0133 gene product 

ESTs, Weakly similar to unnamed protein product [H.sapiens] 
ESTs 

ESTs, Highly similar to p80 katanin [H.sapiens]
Homo sapiens cDNA: FLJ21543 fis, clone COL06171 

gb:ae62f01.s1 Stratagene lung carcinoma 937218 Homo sapiens cDNA clone IMAGE:951481 3' 



RNA-binding protein gene with multiple splicing 
EST 



Hs.73853 
Hs.321775 
Hs.78580 

Hs.101770 

Hs.107674 

Hs.2450 

Hs.97101 

Hs.263988 

Hs.8236 



ESTs 
ESTs 
ESTs 

thyroid hormone receptor interactor 8
hypothetical protein dJ37E16.5 

Homo sapiens mRNA; cDNA DKFZp761C1712 (from clone DKFZp761C1712) 
cullin 5 

gb:HSC2JH062 normalized infant brain cDNA Homo sapiens cDNA clone c-2jh06 3', mRNA 

bone morphogenetic protein 2 

hypothetical protein DKFZp434D1428 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 1 

gb:ym17a12.r1 Soares infant brain 1NIB Homo sapiens cDNA clone 3', mRNA sequence 



gb:yu70c12.s1 Weizmann Olfactory Epithelium Homo sapiens cDNA clone IMAGE:239158 3' 

Hs.108336 ESTs, Weakly similar to ALUE_HUMAN !!!! ALU CLASS E WARNING ENTRY !!! [H.sapiens]

Hs.278994 Rhesus blood group, CcEe antigens

Hs.306084 Homo sapiens clone FLB6914 PRO1821 mRNA, complete cds

Hs.172350 HIR (histone cell cycle regulation defective, S. cerevisiae) homolog A 

Hs.102249 EST 

Hs.102267 lysosomal 

Hs.13094 presenilins associated rhomboid-like protein 

Hs.288757 v-ral simian leukemia viral oncogene homolog A (ras related) 

Hs.85844 neurotrophic tyrosine kinase, receptor, type 1 

Hs.159472 Homo sapiens cDNA: FLJ22224 fis, clone HRC01703 

Hs.288555 Homo sapiens cDNA: FLJ22425 fis, clone HRC06666 

Hs.42635 hypothetical protein DKFZp434K2435 

Hs.109059 mitochondrial ribosomal protein L12 

Hs.272808 putative transcription regulation nuclear protein; KIAA1689 protein

Hs.34851 epsilon-tubulin 

Hs.300208 Sec23-interacting protein p125 

Hs.82042 solute carrier family 23 (nucleobase transporters), member 1 

Hs.155103 eukaryotic translation initiation factor 1A, Y chromosome 

Hs.106346 retinoic acid repressible protein 

Hs.151945 mitochondrial ribosomal protein L43 

Hs.102463 EST 



hypothetical protein FLJ11126

serine/threonine kinase 24 (Ste20, yeast homolog) 

chaperonin containing TCP1, subunit 2 (beta) 

PRO1575 protein

ESTs 

golgin-67 

ESTs 

kinesin heavy chain member 2 
A kinase (PRKA) anchor protein 2 
ESTs 

ESTs, Weakly similar to 2109260A B cell growth factor [H.sapiens]
KIAA1500 protein 
124494 RC_N54831 N54831



N79264 

N52375 

AA903424 

D54120 

AI301740 



124527 RC_N62132

124532 RC_N62375

133213 RC_N63138

124539 RC_N63172

133651 RC_N63772

129196 RC_N63787

124575 RC_N68168

124576 RC_N68201

124577 RC_N68300
mRNA

124578 RC_N68321
124593 RC_N69575
128501 RC_N75007
105691 RC_N75542
128473 RC_N90066
128639 RC_N91246
124652 RC_N92751
133137 RC_N93214_s
124671 RC_N99148
PROTEIN 

133054 RC_R07876 



Hs.271381 
Hs.13565 
Hs.269104 
Hs.102731 
Hs.6786 
Hs.146409 
Hs.173381 



ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]
Sam68-like phosphotyrosine protein, T-STAR 
ESTs 
EST 



N68168 
N68201 
N68300 

N68321 

N69575 

AL133572

AI680737 

T78277 

AW582962 

W19407 



AK001357 



ESTs 

cell division cycle 42 (GTP-binding protein, 25kD) 
dihydropyrimidinase-like 2 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

gb:za11c01.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3', mRNA sequence
Hs.269124 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

gb:za12g07.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:292380 3',

Hs.231500 EST 
Hs.102788 ESTs 

protein containing CXXC domain 2 
Homo sapiens cDNA FLJ11918 fis, clone HEMBB1000272

O-linked N-acetylglucosamine (GlcNAc) transferase (UDP-N-acetylglucosamine:polypeptide-N-
CGI-47 protein

regulator of nonsense transcripts 2; DKFZP434D222 protein
KIAA0318 protein

Homo sapiens cDNA FLJ10495 fis, clone NT2RP2000297, moderately similar to ZINC FINGER



Hs.100293 

Hs.102897 

Hs.3862 

Hs.65746 

Hs.102951 



AA464836 Hs.291079 ESTs, Weakly similar to T27173 hypothetical protein Y54G11A.9
Hs.155421 



130410 RC_R10865_f J00077 

124720 RC_R11056 R05283
similar to

124722 RC_R11488 T97733 Hs.185685

129961 RC_R22947 R23053

repetitive element 128944 RC_R23930_s AL137586 

132965 RC_R26589_f AI248173 Hs.191460 

133740 RC_R37588_s AW162919 Hs.170160 

133074 RC_R37613 AL134275

124757 RC_R38398 H11368

124762 RC_R39179_f AA553722 

124773 RC_R40923 R45154 

135266 RC_R41179 R41179 

131375 RC_R41294_s AW293165 Hs.143134 

133753 RCR42307J NMJ04427 "" 



Hs.6434 

Hs.141055 

Hs.92096 

Hs.106604 

Hs.97393 



gb:yh31a05.r1 Soares placenta Nb2HP Homo sapiens cDNA clone 5' similar to contains L1

Hs.52763 anaphase-promoting complex subunit 7 

hypothetical protein MGC12936 

RAB2, member RAS oncogene family-like 

hypothetical protein DKFZp761F2014 



.f 

124785 RC_R43306 

124792 RC.R44357 

124793 RC.R44519 
sequence. 

1247S9 RC_R45088 
sequence. 

124812 RC.R47948J 
124821 RC_R51524 
127274 RC.R54950 
124835 RC_R55241 
124845 RC_R59585 
124847 RC_R60044 
440630 RC R60872 



Hs.328317 
Hs.280740 
Hs.48712 



R47948 
H87832 
AW966158 
R55241 



Hs.188732 

Hs.7388 

Hs.58582 

Hs.101214 

Hs.101255 

Hs.304177 

Hs.239388 



ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens] 
ESTs 

KIAA0328 protein 
ESTs 

early development regulator 2 (homolog of polyhomeotic 2) 
EST 

hypothetical protein MGC3040 
hypothetical protein FLJ20736 

gb:yg24h04.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:33350 3', mRNA 
gb:yg38g04.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:34896 3', mRNA 
ESTs 

kelch (Drosophila)-like 3

Homo sapiens cDNA FLJ12789 fis, clone NT2RP2001947 



130141 RC_R67266_s NM_004455 Hs.150956 



124879 RC R73588 

124892 RC_R79403 AI970003 

124906 RC_R87647 H75964 

124922 RC R93622 R93622 

124940 RC R99599_s AF068846 

124941 RC R99612 AI766661 
124943 RC T02888 AW963279 
WARNING ENTRY [H.sapiens] 



Hs.101533 ESTs 



124947 RC_T03170

124954 RC_T10465 

132924 RC_T15418_f 

133113 RC_T15597_f 

132975 RC_T15652_i 

133235 RC_T16898_s 

131082 RC_T26644_i 

124980 RC_T40841 

124984 RC_T47566_i 

124991 RC_T50116 



T03170 
AW964237 
U55184 



R43504 
AW960782 
AI091121 

T40841 Hs.98681 

BE313210 Hs.223241 
T50116 

to similar to SP:VE22_LAMBD P03756 EA22 GENE , mRNA sequence. 

129475 RC_T50145_s NM_004477 Hs.203772



eukaryotic translation initiation factor 2, subunit 2 (beta, 38kD)

heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A) 
Hs.27774 ESTs, Highly similar to AF161349 1 HSPC086 [H.sapiens] 
Hs.123373 ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

Hs.100165 ESTs

Hs.6728 KIAA1548 protein 

Hs.154145 hypothetical protein FLJ11585 

Hs.65238 95 kDa retinoblastoma protein binding protein; KIAA0661 gene product 
Hs.6181 ESTs 

Hs.6856 ash2 (absent, small, or homeotic, Drosophila, homolog)-like 
Hs.246218 Homo sapiens cDNA: FLJ21781 fis, clone HEP00223 
Hs.98681 ESTs 

eukaryotic translation elongation factor 1 delta (guanine nucleotide exchange protein)
gb:yb77c10.s1 Stratagene ovary (937217) Homo sapiens cDNA clone IMAGE:77202 3' similar
125000 RC_T58615 T58615

132932 RC_T59940_f AW118826

129534 RC_T63595 AK002126

125008 RC_T64891 T91251

125009 RC_T64924 T64924
132940 RC_T64933_r T79136
125017 RC_T68875 T68875



Hs.110640 ESTs

Hs.6093 Homo sapiens cDNA: FLJ22783 fis, clone KAIA1993
Hs.11260 hypothetical protein FLJ11264 

gb:yd60a10.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3', mRNA sequence 



Hs.127243 Homo sapiens mRNA for KIAA1724 protein, partial cds

gb:yc30f05.s1 Stratagene liver (937224) Homo sapiens cDNA clone IMAGE:82209 3', mRNA

T69027 Hs.57475 sex comb on midleg homolog 1

T69981 gb:yc19d03.r1 Stratagene lung (937210) Homo sapiens cDNA clone 5', mRNA sequence

AI084813 Hs.13197 ESTs

AI873257 Hs.7994 hypothetical protein FLJ20551

AW970209 Hs.111805 ESTs

T85104 Hs.222779 ESTs, Moderately similar to similar to NEDD4 [H.sapiens] 

T80622 Hs.268601 ESTs, Weakly similar to envelope [H.sapiens] 

T85352 gb:yd82d01.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:114721 3'
similar to contains Alu repetitive element;contains L1 repetitive element ;, mRNA sequence. 

125064 RC_T85373 T85373 gb:yd82f07.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:114757 3'
similar to contains Alu repetitive element;contains MER3 repetitive element ;, mRNA se
125066 RC_T86284 T86284
Alu repetitive element;, mRNA sequence 
112264 RC_T89579_s AL045364 Hs.79353 



125018 RC_T69027

125020 RC_T69924

129891 RC_T70353

134204 RC_T79780_s

125050 RC_T79951

125052 RC_T80174_s

125054 RC_T80622

125063 RC_T85352 



gb:yd77b07.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3' similar to contains 
transcription factor Dp-1 

ESTs, Highly similar to ALU6_HUMAN ALU SUBFAMILY SP SEQUENCE CONTAMINATION 
WARNING ENTRY [H.sapiens]

AW576389 Hs.335774 EST, Moderately similar to S65657 alpha-1C-adrenergic receptor splice form 2 [H.sapiens]

gb:ye40a03.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3' similar to 



125104 RC_T95590

gb|M10817|IGURRAA Iguana iguana 5S (rRNA );, mRr- 
135107 RC T97257J T97257 Hs.337531 
129550 RC T97599J AA845462 
125118 RC T97620 



Hs.100717 EST 
Hs.79432 



W38240 
AA247778 

AW453069 Hs.3657 
W93127 



similar to contains Alu repetitive element;, mRNA sequence. 

125120 RC_T97775

134160 RC_T98152

125136 RC_W31479

125144 RC_W37999

125150 RC_W38240

104180 RC_W40150

131987 RC_W45435

125178 RC_W58202

125180 RC_W58344

125182 RC_W58650

130588 RC_W68736

125197 RC_W69106

133497 RC_W69111

100562 RC_W69385

125639 RC_W69399

129232 RC_W69459

101495 RC_W72424

125209 RC_W72724

125212 RC_W72834

129132 RC_W73955

125223 RC_W74701
WARNING ENTRY [H.sapiens] 



ESTs, Moderately similar to I38022 hypothetical protein [H.sapiens] 
Hs.124024 deltex (Drosophila) homolog 1 

gb:yf35f11.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:128877 3' 



fibrillin 2 (congenital contractural arachnodactyly) 
ESTs 

KIAA1321 protein 

Empirically selected from AFFX single probeset 
Hs.119155 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 814975
activity-dependent neuroprotective protein 



Hs.31845 

Hs.103120 ESTs 
AA451755 Hs.263560 
Hs.16411 
Hs.278554 
BE617303 Hs.74266 



AL030996 



ESTs 

hypothetical protein LOC57187 
heterochromatin-like protein 1 
hypothetical protein MGC4251 



Hs.301512 nuclear mitotic apparatus protein 1

Hs.226117 H1 histone family, member 0

Hs.109655 sex comb on midleg (Drosophila)-like 1

Hs.112405 S100 calcium-binding protein A9 (calgranulin B)

Hs.103174 ESTs, Weakly similar to TSP2_HUMAN THROMBOSPONDIN 2 PRECURSOR [H.sapiens]

Hs.103173 ESTs

Hs.108847 hypothetical protein MGC2749

AI916269 Hs.109057 ESTs, Weakly similar to ALU5_HUMAN ALU SUBFAMILY SC SEQUENCE CONTAMINATION 



W72424 
W72724 
AA746225 



W74169 Hs.16492 



125225 RC_W76540 
125228 RC_W79397
132393 RC_W85888
125238 RC_W86038 
125247 RC_W86881 
129296 RC_W87804 
125263 RC W88942 
125266 RCJV90022 W9UU;h 
PRECURSOR [H.sapiens] 
131321 RC W92272 U91543 
131601 RC_W92764_s NM_007115 Hs.29352 
131677 RC W93040 H05317 



AL135094

N99713 

AA694191 

AI051967 

AA098878 



DKFZP564G2022 protein 

ESTs, Weakly similar to I36022 hypothetical protein [H.sapiens] 
Hs.47334 hypothetical protein FLJ14495 
Hs.109514 ESTs 
Hs.163914 ESTs 
Hs.110122 ESTs

gb:zn45g10.r1 Stratagene Hela cell s3 937216 Homo sapiens cDNA clone 5', mRNA sequence 
Hs.186809 ESTs, Highly similar to LCT2_HUMAN LEUKOCYTE CELL-DERIVED CHEMOTAXIN 2 

Hs.25601 



120837 RC_W93092 

125277 RC W93227 

125278 RC_W93523 
0 RC V\ 



W93227 
AI218439 
AI123705 
W93949 
AI419294 



Hs.33245 



131856 RC_W94003. 

131844 RC_W94401. 

125284 RC_W94688 

313447 RC W94787_s AW016321 Hs.82306 

130799 RC_Z38294_s AB028945 Hs.12596 

125289 RC Z38311 T34530 Hs.4210 

128874 RC_Z38465_s H06245 Hs.106801 



chromodomain helicase DNA binding protein 3 
tumor necrosis factor, alpha-induced protein 5 
Hs.283549 ESTs

Hs.306621 Homo sapiens cDNA FLJ11963 fis, clone HEMBB1001051
Hs.103245 EST

enhancer of polycomb 1 
ESTs 

ESTs



destrin (actin depolymerizing factor) 

cortactin SH3 domain-binding protein 

Homo sapiens cDNA FLJ13069 fis, clone NT2RP3001752

ESTs, Weakly similar to PC4259 ferritin associated protein [H.sapiens] 
130965 RC_Z38525_s
128875 RC Z38538J 
133200 RC_Z38551_s 
130158 RC_238783_s 
125295 RC_Z39113 
domain, (semaphorin) 4F
125298 RC_Z39255_f 
125300 RC_Z39591 
323122 RC_Z39783_s 
311463 RC_Z39920 
130882 RC_Z40166J 
128888 RC_Z40388_s 
125310 RC_Z40646 
125315 RC_Z41697 
125317 RC_Z99349 
135096 RC_Z99394_s 
104786 RC AA027168 
132837 D58024.S 
120456 RC AA251113 
132459 RC_AA347573 
101545 M31210 
133505 C01527 
132360 RC_N62948_s 
132738 RCJV42674 
119586 RC_W43000_s 
129914 RC_N31750_s 
130839 AF009301 
132813 L37347 
134342 M99564 
131878 RC_AA430673 
105426 RCAA251297 
132968 RC_M620722 
132173 RC_W46577_s 
113932 RCW81237 
114452 RC_AA020825 
PROTEIN TC10 
115243 RC.AA278766 
134403 RC_H937C8_s 
129647 RCJI49394 
111428 RC_H56559_s 
115967 RCJ\A446887 
120726 RC_AA293656 
114995 RC_AA251152 
303876 RC_AA233334_s 
dominant, ataxin 3) 
311463 RC_Z39920 
120302 RC AA192173 
133071 RC_AA455044 
121032 RC_AA398504 
129829 U41813 
120245 RC_AA166965 
120985 RCAA398222 
114184 RC_Z39095 
447503 RC_AA284744_f 
cds 

132837 RC_AA428201 
121034 RC_M398507 
119718 RC_W69216 
120455 RC_AA251083 
125280 RC_W93659 
132155 RC_AA227903 
120609 RC_AA283902 
121278 RC_AA401631 
109023 RC_jAA157293 
129815 RC_D60208_f 
108061 RC AA043979 
113287 ROJ66847 
114082 RC_Z38239 
116334 RC_AA491457 
131486 RC_Z40071_s 
107860 RC_AA024961 
131263 RC_AA443826 
132207 RC AA443294 
129183 RC_AA155743 
408431 RCJ23708 
120575 RC AA280934 



AB037715 
AB032947 
AB022317 



AA081258 
AA027167 
M370362 
AA488750 
AL120071 
BE246154 
AI630124 



AK000738 

AF088033 

NM_012421

AB011169 

BE313625 

NM_000275 

AA083764 

W20027 

AF234532 



AW971018 Hs.21659 
Hs.106808 
Hs.183639 
Hs.151301 
Hs.25887 



Hs.289008

Hs.101376 

Hs.264915 

Hs.22142 

Hs.20887 

Hs.241558 

Hs.124953 

Hs.106296

Hs.112461

Hs.132390 

Hs.10031 

Hs.57958 

Hs.88414 

Hs.48998 

Hs.154210

Hs.324504 

Hs.46440 

Hs.264636 

Hs.159225

Hs.13321

Hs.20141 

Hs.57435 

Hs.82027 

Hs.6101 

Hs.23439 

Hs.61638 

Hs.41716 

Hs.126485 

Hs.243010 

Hs.116665
Hs.82767 
Hs.118140 
Hs.174174 
Hs.42911 
Hs.97293 
Hs.193657 



Hs.22142 
Hs.269933 
Hs.64313 
Hs.161798 
Hs.127428 
Hs.111045 
Hs.97592 
Hs.21062 
Hs.336898 

Hs.57958 
Hs.271623 
Hs.92848 
Hs.104347 



AK001612 Hs.26962 

AL038450 Hs.48948 

F06972 Hs.27372 

AA024961 Hs.50730 

AU077002 Hs.24950 



AA334551 
AB018259 
AL031428 



AI219896 
R56434 
AA115496



AL389951 

W69216 

AA251720 

AI123705 

AK001607 

AW978721 

AA037121 

AA157293 

BE565817 



ESTs 

kelch (Drosophila)-like 1 
hypothetical protein FLJ10210
Ca2+-dependent activator protein for secretion
sema domain, immunoglobulin domain (Ig),



Homo sapiens cDNA: FLJ21814 fis, clone HEP01068



BE561824 
AI338631 
AW978022 



EST 

Homo sapiens cDNA FLJ12908 fis, clone NT2RP2004399 

cytochrome b5 reductase b5R.2 

hypothetical protein FLJ10392

ariadne (Drosophila) homolog 2 

ESTs 

ESTs 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]
zinc finger protein 36 (KOX18)
KIAA0955 protein
EGF-TM7-latrophilin-related protein

BTB and CNC homology 1, basic leucine zipper transcription factor 2 

fibronectin leucine rich transmembrane protein 2 

endothelial differentiation, sphingolipid G-protein-coupled receptor, 1 

Homo sapiens mRNA; cDNA DKFZp586J0720 (from clone DKFZp586J0720) 

solute carrier family 21 (organic anion transporter), member 3 

hypothetical protein FLJ20731 

ESTs 

rearranged L-myc fusion sequence 
similar to S.cerevisiae SSM4

solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2

oculocutaneous albinism II (pink-eye dilution (murine) homolog)

hypothetical protein MGC3178

ESTs 

myosin X 

endothelial cell-specific molecule 1 

hypothetical protein FLJ12604; KIAA1692 protein 

Homo sapiens cDNA FLJ14445 fis, clone HEMBB1001294, highly similar to GTP-BINDING 

KIAA1842 protein 

sperm specific antigen 2 

KIAA0716 gene product 

KIAA0601 protein 

ESTs 

ESTs 

ESTs 

Machado-Joseph disease (spinocerebellar ataxia 3, olivopontocerebellar ataxia 3, autosomal 

cytochrome b5 reductase b5R.2 
ESTs 

ESTs, Weakly similar to AF2571 B2 1 G-protein-coupled receptor 48 [H.sapiens] 
ESTs 

homeo box A9
ESTs 
ESTs 
ESTs 

Homo sapiens, Similar to RIKEN cDNA 1810038N03 gene, clone MGC:9890, mRNA, complete 

EGF-TM7-latrophilin-related protein 

nucleoporin 50kD 

ESTs 

ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!! [H.sapiens]
ESTs 

hypothetical protein FLJ13220

ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens] 

Homo sapiens cDNA FLJ11490 fis, clone HEMBA1001918 

ESTs 

hypothetical protein FLJ21657 
EST 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]
Homo sapiens cDNA FLJ10750 fis, clone NT2RP3001929 
ESTs 

BMX non-receptor tyrosine kinase
ESTs 

regulator of G-protein signalling 5 
E2F transcription factor 6

uncharacterized hematopoietic stem/progenitor cells protein MDS027 
Homo sapiens cDNA: FLJ22536 fis, clone HRC13155 
hypothetical protein DKFZp762E1511; KIAA1816 protein 
132121
117657 
134922 
118523 
116845 
115291 
120326 
130174 
129131 



115985 
134637
132714
129771 
123360 






RCjM443284_! 

RC N39074 

RC W04507_s 

RC_R41828_s 

RC_H64973 

RC AA279943 

RC AA196300 

M29550 

RC AA436489 

RC_AA287032 

RCJI70777 

RC_AA496921 

RC_AA447709 



AI718295 
Y07759 
AA649530 
BE545072 






RC_AA252598 

RC.H73237 

RC_AA504784 



113716 



120541 
116727 
118219 
119767
128917 
451553 
132716 



RC_T97750 

RC.W48860 

RC_Z38501 

RC.AA278298 

RC.F13684 

RC.N62231 

RC_W72562 

RC.AA481252 

RC_AA020928 

RC.AA251288 

RC_N67861 

RC_M084162 

RC W70242 

RC.AA425151 

RC AA460324 

U44378 

RC.W74471 

RC AA435842 

RC.AA243017 

RC_N53367 

RC_AA490227 

M63154 



AW172431
AL137554 
AF010258 
AA447709 
U87309
W39388 
AL096748 
AA532718 
AI936442 
AA001356 
AW014486 
AL135301 
W07318 
R76472 
AA862391 
W72562 
AI365215 
AA018454 



Hs.404 1 
Hs.44933 
Hs.91161 
Hs.170157 

Hs.122579 
Hs.21145 
Hs.151531 
Hs.177534 
Hs.13012 



myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog); translocated to 



myosin VA (heavy polypeptide 12, myoxin) 

gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens cDNA clone, mRNA sequence
hypothetical protein FLJ10461



114618 
119743 
108154 
122798 
133746 
119822 
122186 
114941 
118053 
123234 
129280 



116750 RC.H05960 



N67861 
AW979261 
AA947552 
NM_005754 



Hs.55336 

Hs.102708 

Hs.178604 

Hs.59838 

Hs.18159 

Hs.22509 

Hs.8768 

Hs.240 

Hs.65646 

Hs.48494 

Hs.58119 

Hs.206097 

Hs.269211 

Hs.283738 



AW410035 Hs.75862 
AF086409 Hs.301327 
Hs.104673 
Hs.87331 
Hs.47629 
Hs.16697 
Hs.110014 
Hs.323056 



Hs.108043 

Hs.301957

Hs.103446 

Hs.4190

Hs.37706 



AA398811 

AA236512 

N53391 

NM_001938

M63154 



105127 RC_AA158132 
114513 RC_AA044825 AA044873 
411856 RC_T35697 H67899 
132036 W01568 
130091 RC_W88999 
sequence 
414108 U09564 
119881 RC_W81456
117770 RC_N47953
119850 RC_W80447
115439 RC_AA284561 
123107 RC_AA486071 
405698 M24364 
121231 RC_AA400780 
132074 AB002366 
413670 AB000115 
125277 RC_W93227 
114056 RC_AA186324 
121153 RC AA399640 AA399640 
121609 RC_AA416867 AA416867
120661 RC_AA287556 AA287556 
120850 RC_AA349647 AA349647 
124947 RC T03170 T03170 
130529 RC.AA280885 AA178953 
repetitive element, mRNA sequence 
117683 RC_N40180 N40180 
IMAGE:276387 3' similar to contains L1.t1 
120745 RC_AA302809 AA302809 
120936 RC_AA385934 AA385934 
112597 RC.R78376 R78376 
120183 RC_240174 AW082866 
120644 RC_AA287038 AI869129



W81486 

AW957372 

AI247568 

AI567972 

AA225048 

X03068 

AA814948 

AA478486 

AB000115 

W93227 

AA188175 






3), catalytic subunit, beta isoform (calcineurin A beta) 



ESTs 

protein kinase NYD-SP15 
homeo box A9

ESTs, Weakly similar to T08599 probable transcription factor CA150 [H.sapiens] 

vacuolar protein sorting 41 (yeast homolog) 

Homo sapiens, clone MGC:17421, mRNA, complete cds 

DKFZP434A043 protein 

ESTs 

hypothetical protein FLJ10808

ESTs 

ESTs 

hypothetical protein FLJ10849 
M-phase phosphoprotein 1 
ESTs 

ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens] 



ESTs 

Ras-GTPase-activating protein SH3-domain-binding protein
splicing factor (CC1.3) 

MAD (mothers against decapentaplegic, Drosophila) homolog 4 

ESTs 

ESTs 

ESTs 

ESTs 

down-regulator of transcription 1, TBP-binding (negative cofactor 2)

gastric intrinsic factor (vitamin B synthesis) 

ESTs 

ESTs 

Friend leukemia virus integration 1 

nudix (nucleoside diphosphate linked moiety X)-type motif 5 

ESTs 

Homo sapiens cDNA: FLJ23269 fis, clone COL09533
hypothetical protein DKFZp434E2220 

gb:zh70h03.s1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone 3', n 

SFRS protein kinase 1 
ESTs 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens] 
ESTs 

ESTs, Highly similar to AF161437 1 HSPC319 [H.sapiens]



Hs.75761 
Hs.58648 
Hs.46791 
Hs.58452
Hs.193090 

Hs.104207 ESTs 

Hs.73931 major histocompatibility complex, class II, DQ beta 1 

Hs.96343 ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!! [H.sapiens]

Hs.3852 KIAA0368 protein 

Hs.75470 hypothetical protein, expressed in osteoblast 

Hs.103245 EST 

Hs.82505 KIAA1254 protein

Hs.97694 ESTs 

Hs.98185 EST 

Hs.263412 ESTs, Weakly similar to ALU3_HUMAN !!!! ALU CLASS B WARNING ENTRY !!! [H.sapiens] 

Hs.96927 Homo sapiens cDNA FLJ12573 fis, clone NT2RM4000979 

Hs.100165 ESTs 

gb:zp39e03.s1 Stratagene muscle 937209 Homo sapiens cDNA clone 3' similar to contains Alu 

gb:yy44d02.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone 
L1 repetitive element ;, mRNA sequence. 

gb:EST10426 Adipose tissue, white I Homo sapiens cDNA 3' end, mRNA sequence. 
Hs.97184 EST, Highly similar to (defline not available 7499603) [C.elegans]
Hs.29733 EST 
Hs.65882 ESTs 
Hs.96616 ESTs 
119023 RC_N98488 N98488
IMAGE:310129 3', mRNA sequence. 

107582 RC_AA002147 AA002147 

118249 RC_N62580 N62580 

115022 RC_AA252029 AA252029

117710 RC_N45198 N45198

115341 RC_AA281452 AA281452

118896 RC_N90680 N46213

121121 RC_AA399371 AA399371

118329 RC_N63520 N63520
3', mRNA sequence.
119496 RC_W35416
118111 RC_N55493
mRNA sequence.

119062 RC_R16698

116710 RC_F10577_f F10577 
119261 RC_T15956 

122723 RC_AA457380 AA457380 



Hs.59952 
Hs.322925 
Hs.87935 
Hs.47248 



W35416 



gb:zb82h01.s1 Soares_senescent_fibroblasts_NbHSF Homo sapiens cDNA clone
EST 

EST, Weakly similar to putative p150 [H.sapiens] 
ESTs 

ESTs, Highly similar to similar to Cdc14B1 phosphatase [H.sapiens]
EST, Weakly similar to granule cell marker protein [M.musculus]
methionine adenosyltransferase II, beta
similar to SALL1 (sal (Drosophila)-like
gb:yy62f01.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:278137

Hs.156861 ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]

gb:yv50c02.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:246146 3',



AW444881 Hs.77829 
Hs.306088 



gb:aa86b10.s1 Stratagene fetal retina 937202 Homo sapiens cDNA clone IMAGE:838171 3' 
similar to contains L1.b3 L1 repetitive element ;, mRNA sequence.
117732 RC_N46452 N46452 gb:yy76h09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone

IMAGE:279521 3' similar to contains L1.t2 L1 repetitive element ;, mRNA sequence.

104787 RC_AA027317 AA027317 gb:ze97d11.s1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:366933 3'

similar to contains Alu repetitive element;, mRNA sequence.



100071 A28102 A28102 



115819 RC AA426573 


AA486620 


Hs.41135 


130882 RC Z40166J 


AA497044 


Hs.20887 


125225 RC W76540 


W74169 


Hs.16492 


108339 RCAA070801 


AW151340 


Hs.51615 


WARNING ENTRY [H.sapiens] 




100338 D63483 


D86864 


Hs.57735 


121636 RC_AA417027


AA379203 


Hs.306654 


103875 RC_AA418387 


T26379 


Hs.48802 


118716 RC_N73460 


AI658908 


Hs.118722


119763 RC W72450 


R54146 


Hs.10450 


121917 RC_AA428218 


AA406397 


Hs.98038 


132806 M91488 


AI699432 


Hs.278619 


130949 Y10659 


AV656840 


Hs.285115 


108806 RC.AA129933 


AF070578 


Hs.71168 


133276 RC AA490478 


AW978439 


Hs.69504 


134760 RC_H16758 


NM_000121


Hs.89548 


132867 AA121287 


AF226667 


Hs.58553 


132051 AA091284 


AA393968 


Hs.180145 


114208 RC_Z39301 


AL049466 


Hs.7859 


104094 AA418187 


AA418187 


Hs.330515 


128718 AA426361 


NM_002959


Hs.281706 


302032 RC N20407 


NM_001992


Hs.128087


115501 RC AA291553 


AA291553 


Hs.190086 


101997 U01160 


AU076536 


Hs.50984 


103708 AA037206 


AA430591 


Hs.72071 


101899 S59184 


S59184 


Hs.79350 


115839 RC_AA429038


BE300266 


Hs.28935 


409459 D50678 


D86407 


Hs.54481 


103563 Z22534 


L02911 


Hs.150402


123233 RC AA490225 


AW974175 


Hs.188751 



Human GABAa receptor alpha-3 subunit 
endomucin-2 

hypothetical protein FLJ10392 



55 [H.sapiens] 



121305 RC AA402468 AA402468 

114798 RCAA159181 AA159181 

133145 RCJ\A196549 H94227 

131567 RC_AA291015_s AF015592 

112300 RC_R54554 H24334 

129507 RC.AA192099 AJ236885 

121033 RC_AA398505 AA398505 

121151 RC_AA399636 AA399636 

121402 RC.AA406063 AA406063 

123203 RC AA489671 AA352335 

132271 RC.AA236466 AB030034 

125197 RCJV69106 AF086270 

114935 RC_AA242809 H23329 
WARNING ENTRY [H.sapiens] 

125279 RC.W93640 AW401809 

108778 RC_AA128548 AF133123 

108087 RC_AA045709 AA045708 

132466 RC_N66810 s AI597655 

133328 R3S553 AW452738 

124057 RC_F13604 AA902384 

124800 RC_R45115 



Hs.112180 
Hs.97360 
Hs.143629 
Hs.98003 
Hs.65641 
Hs.115175
Hs.278554 



ESTs, Weakly similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE CONTAMINATION
acetyl LDL receptor; SREC

Homo sapiens cDNA FLJ13574 fis, clone PLACE10D8625
Homo sapiens clone 23632 mRNA sequence
fucosyltransferase 8 (alpha (1,6) fucosyltransferase)
Homo sapiens cDNA: FLJ22063 fis, clone HEP10326 
ESTs 

hypothetical protein FLJ10099
interleukin 13 receptor, alpha 1 
Homo sapiens clone 24674 mRNA sequence 
ESTs 

erythropoietin receptor 
CTP synthase II 
HSPC030 protein 
ESTs 
ESTs 
sortilin 1 

coagulation factor II (thrombin) receptor 
ESTs 

sarcoma amplified sequence 
hypothetical protein FLJ20038 
RYK receptor-like tyrosine kinase 

transducin-like enhancer of split 1, homolog of Drosophila E(sp1)
low density lipoprotein receptor-related protein 8, apolipoprotein e receptor
Hs.150402 Activin A receptor, type I (ACVR1) (ALK-2)

ESTs, Weakly similar to MAPB_HUMAN MICROTUBULE-ASSOCIATED PROTEIN 1B

ESTs 

serologically defined colon cancer antigen 1 
Homo sapiens, clone IMAGE:2961368, mRNA, partial cds 
CDC7 (cell division cycle 7, S. cerevisiae, homolog)-like 1
ESTs 

zinc finger protein 148 (pHZ-52) 
ESTs 
ESTs 
ESTs 

hypothetical protein FLJ20073

sterile-alpha motif and leucine zipper containing kinase AZK 
heterochromatin-like protein 1 

ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION
Hs.4779 KIAA1150 protein 

Hs.90847 general transcription factor IIIC, polypeptide 3 (102kD) 
Hs.40545 ESTs 
Hs.49265 ESTs 

Hs.265327 hypothetical protein DKFZp761l141 
Hs.73853 bone morphogenetic protein 2 
Hs.138617 thyroid hormone receptor interactor 12 
Hs.146343
Hs.122647 
Hs.101590 
Hs.107815 
Hs.67624 
Hs.86022 
Hs.222779 
Hs.52184 



121029 RC_AA398482 AA398482

120663 RC_AA287627 AA827798

102133 U15173 AU076845

108246 RC_AA062855 AI423132

125226 RC_W78134 AA782536

120260 RC_AA171739 AK000061

124906 RC_R87647 H75964

109406 RC_AA226877 AA199883

109271 RC_AA195668 AW137422

125052 RC_T80174_s T85104

109101 RC_AA167708 AW608930 

115241 RC_AA278723 AA648278 

117163 RC_H97909 N36851 

113530 RC_T90313 T90313 

120375 RC_AA227260 AF028706 

129435 AA314256 AF151852 
114864 RC_AA235256 
103988 AA314389 

131006 RC_AA242763 AF064104

106781 RC_AA478474 AA330310 

106141 RC_AA424558 AF031463 

116213 RC_AA476738 AA292105 

135266 AB002326 R41179 

135058 RC_AA430152 AI379720 Hs.9^ 



Hs.97641 EST 
Hs.105089 ESTs



AA314389 



Hs.42344 
Hs.16732 
Hs.111227 
Hs.111449
Hs.71608 
Hs.42500 
Hs.22116 
Hs.24181 
Hs.9302 
Hs.326740 
Hs.97393 



BCL2/adenovirus E1B 19kD-interacting protein 2 
ESTs 

N-myristoyltransferase 2 

hypothetical protein 

ESTs 

ESTs 

ESTs 

ESTs, Moderately similar to similar to NEDD-4 [H.sapiens] 

hypothetical protein FLJ20618 

ESTs 

ESTs 

ESTs 

Zic family member 3 (odd-paired Drosophila homolog, heterotaxy 1)

CGI-94 protein 

ESTs 

ADP-ribosylation factor-like 5

CDC14 (cell division cycle 14, S. cerevisiae) homolog B 

ESTs 



AA524470 
AW207152 
NM_016940



119908 RC_W85844 
103695 AA018758 
103978 AA307443 

109485 RC_AA233472 BE619092 

129574 AA458603 AA026815 

115347 RC.AA281528 AA356792 

120765 RC_AA338735 AW961026 
WARNING ENTRY [H.sapiens] 

121059 RC_AA398628 AA393283 

131887 AA046548 W17064
member 1 

112064 RC R43812 AL049390 

115606 RC_AA400465 AI025829 

131750 RC_H94855_s NM_004349 

102123 U14518 NM_001809

129847 RC_W46767 N64025 

133809 RC_AA235275 AV649326 

132210 RC_N51499_s NM_007203

122356 RC_AA443794 AA443794 

114958 RC_AA243708 N20912 

103951 AA287840 AL353944 

134703 RC_AA280704 AF117065
128727 AA287864 AI223335 
105743 RC_AA293300_s BE246502 
domain, (semaphorin) 4B 

103744 AA076003 AA079267 
sequence 

114348 N80402 AL050321 

114009 RC_W90067 AI248544 

134704 RC_AA280849 AA837124 
128629 AA399187 AL096748 
104410 H65925 
110200 RCH21075 
124483 RC_N53976 
101391 M14648 
109657 RC.F04826 
117140 RC_H96813 
132937 RC_AA233706_f AW952912 
129799 R36410 AW967473 
105077 RC_AA142919 W55946 
100850 RC_N58561_s AA836472 
131043 RCJM90925 AF084535 
118417 RC_N66048_f AF080229 
129254 RC_AA243695 AA252468 
119149 RC_R58910 BE304701 
133996 AA091367 AA380267 
110223 RC_H23747 H19836 
117626 RC_N36090 AK001757 
135286 RC_AA424469_s AW023482 
122967 RC_AA478521 AA806187 
131236 AA282640 AF043117 

H12912



Hs.58753 
Hs.186600 
Hs.34136 
Hs.28465 
Hs.11463 
Hs.334824 
Hs.96752 



Hs.31551 

Hs.1594 

Hs.296178 

Hs.76359 

Hs.42322 

Hs.98390 

Hs.42369 

Hs.50115 

Hs.88764 

Hs.50651 

Hs.9598 



hypothetical protein MGC10947 

KIAA0328 protein 

hypothetical protein

ESTs 

ESTs 

chromosome 21 open reading frame 6 

Homo sapiens cDNA: FLJ21869 fis, clone HEP02442 

UMP-CMP kinase 

hypothetical protein FLJ14825 

ESTs, Weakly similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE CONTAMINATION 

gb:zt74e03.r1 Soares_testis_NHT Homo sapiens cDNA clone 5', mRNA sequence 
SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, 

Homo sapiens mRNA; cDNA DKFZp58601318 (from clone DKFZp58601318) 
ESTs 

core-binding factor, runt domain, alpha subunit 2; translocated to, 1; cyclin D-related
centromere protein A (17kD) 
hypothetical protein FLJ22637 



Homo sapiens mRNA; cDNA DKFZp761J1112 (from clone DKFZp761J1112) 
male-specific lethal-3 (Drosophila)-like 1 
Janus kinase 1 (a protein tyrosine kinase) 

sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic 
gb:zm97e10.s1 Stratagene colon HT29 (937221) Homo sapiens cDNA clone 3', mRNA 



AI807519 

H21075 

AI821780 

NM_002210

R60900 

H96813 



CRP2 binding protein 
KIAA0831 protein 
ESTs 

DKFZP434A043 protein 

Homo sapiens cDNA FLJ13694 fis, clone PLACE2000115
ESTs, Highly similar to A59235 unconventional myosin-15 [H.sapiens] 
ESTs 

integrin, alpha V (vitronectin receptor, alpha polypeptide, antigen CD51) 
ESTs 
ESTs 

hypothetical protein MGC3032 
mannosidase, alpha, class 1A, member 2 
Hs.234863 Homo sapiens cDNA FLJ12082 fis, clone HEMBB1002492
Hs.297939 cathepsin B

Hs.22464 epilepsy, progressive myoclonus type 2, Lafora disease (laforin)

gb:Human endogenous retrovirus K clone 10.1 polymerase mRNA, partial cds
DKFZp434J1813 protein
ESTs 

DKFZP434F2021 protein 



Hs.102708 
Hs.104520 
Hs.31802 
Hs.179864 
Hs.295726 
Hs.26814 
Hs.42241 



Hs.239114



Hs.1098 
Hs.65732 
Hs.78277 



Hs.289101 
Hs.24594 
Hs.274691 



hypothetical protein FLJ10895
ESTs 

glucose regulated protein, 58kD 
ubiquitination factor E4B (homologous to ; 
adenylate kinase 3 
112888 RC_T03872
115192 RCAA261920 
118688 RC_N71484 
122264 RC AA436837 
128981 AA135452 
131042 RC R42457 
103704 AA028171 
121341 AA233107 
106593 RC_AA456826 
115195 RC_AA262156 
115425 RCAA284071 
117258 RC.N21299 
120209 RC_Z40892 
sequence 
134082 L16991 
104774 RC_M026066 
115625 RC_M401630 
104469 N28707 
107401 W20054 
111686 RC_R21510 
115300 RC AA280026 
115378 RC M282292 
132224 RCH97819 
113791 M95767 
129144 AA004987 
104448 L44574 
132084 RC_T26981_s 
111831 RC R36083 
114765 RC AA252163 
115029 RC_AA252219
100457 H81492 
104536 R24011 
PROTEIN 91 
116167 RC AA461562 
103889 AA236771 
131978 RC H48459_s 
118843 RCJI80181 
120837 RC W93092 
133647 D21852 
129521 U41815 
103746 AA081875 



AW195317 
AA741024 
AK000708 



AA927177 
AI825288 
AA028171 
AF035528 
AW296451 



Hs.86041 
Hs.171637 
Hs.151258 
Hs.153863
Hs.24605
Hs.155849
Hs.180680
Hs.42975 



hypothetical protein FLJ22344
ESTs 

hypothetical protein FLJ20701 

gb:zv57g07.s1 Soares_testis_NHT Homo sapiens cDNA clone 3', mRNA sequence 
CGG triplet repeat binding protein 1 
hypothetical protein MGC2628 
hypothetical protein FLJ21062 

MAD (mothers against decapentaplegic, Drosophila) homolog 6 

ESTs 

ESTs 

ESTs, Weakly similar to 154374 gene NF2 protein [H.sapiens] 



gb:HSC1HB082i 



infant brain cDNA Homo sapiens cDNA clone c-1hb08 3', mRNA 



AA059459 

N28707 

N91453 

R22039 

AA280095 

AA282292 

N41549 

AI269096 

AL137275

NM_007331

NM_002267 

R36095 



NM_015361 
AF071076 
AA075000 



Hs.135578 

Hs.20137 

Hs.110457

Hs.3886 

Hs.268695 

Hs.337532 

Hs.40096 

Hs.285176 

Hs.158101 



Hs.221498
Hs.306621
Hs.268053
Hs.112255



132019 RC_AA134965_i H5S995 
132310 RC AA284107 AA173223 
117367 RC.N24954 AI041793 
103743 AA075998 AA075998 

gb:M15887 ACYL-COA-BINDING PROTEIN (HUMAN);, mRNA sequence 



Hs.37372 
Hs.289044 
Hs.42502 



deoxythymidylate kinase (thymidylate kinase) 

Homo sapiens cDNA FLJ12977 fis, clone NT2RP2006261

ESTs 

Homo sapiens chromosome 19, BAC 282485 (CIT-B-344H19) 

ESTs 

ESTs 

ESTs 

hypothetical protein FLJ10335
ESTs

chitobiase, di-N-acetyl-
hypothetical protein DKFZp434P0116 
Wolf-Hirschhorn syndrome candidate 1 
karyopherin alpha 3 (importin alpha 4) 
ESTs 

ESTs, Weakly similar to A47532 B-cell growth factor precursor [H.sapiens]
ESTs 

acetyl-Coenzyme A transporter 

Homo sapiens cDNA FLJ14673 fis, clone NT2RP2003714, moderately similar to ZINC FINGER

hypothetical protein FLJ20045 
ESTs 

KIAA0186 gene product
ESTs

Homo sapiens cDNA FLJ11963 fis, clone HEMBB1001051
KIAA0029 protein 
nucleoporin 98kD 

gb:zm83c07.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone 3', mRNA 

Homo sapiens DNA binding peptide mRNA, partial cds 
Homo sapiens cDNA FLJ12048 fis, clone HEMBB1001990 
ESTs 

gb:zm89b09.r1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone 5' similar to 



AA913909 
AA504428 
AI187925



130237 L39050 
128752 RC_N72879 
135162 AA045930 
131386 AA096412 BE219898 
129021 RCJW\599244 AL044675 
424274 AA293634 
129913 H06583 



W73933 



gb:nz79b10.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 3' similar to gb:M34539 FK506-

Hs.153088 TATA box binding protein (TBP)-associated factor, RNA polymerase I, A, 48kD 

Hs.10487 Homo sapiens, clone IMAGE:3954132, mRNA, partial cds 

Hs.95667 F-box protein 30 

Hs.173135 dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 2 

Hs.173081 KIAA0530 protein 

Hs.283738 casein kinase 1, alpha 1 

Hs.13313 cAMP responsive element binding protein-like 2 

Hs.34054 Homo sapiens cDNA: FLJ22488 fis, clone HRC10948, highly similar to HSU79298 Human clone 

cleavage and polyadenylation specific factor 2, 100kD subunit 
NPD009 protein 

H.sapiens gene from PAC 106H8 
ESTs, Weakly similar to I54374 gene NF2 protein [H.sapiens] 
ESTs 

BE465093 Hs.106101 hypothetical protein FLJ22557 

gb:zr1 9c09.s1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone 



AB037788 
AW024973 
AL035301 



23803 mRNA 
118612 RC_N69466 
322026 AA203138 
110892 RC_N38882 
111429 RC_R01245 
113334 RCJ76962 
104091 AA417310 

105246 RC_AA226879 AA226879 

IMAGE:663856 3' similar to contains Alu repetitive element;, mRNA sequence. 

113300 RC_T67448 T67448 Hs.13101 ESTs 

117147 RC_H97225_s AW901347 Hs.38592 

121349 RC_AA405205 AA405205 Hs.97960 

100294 D49396 AA331881 Hs.75454 

133999 M28213 AA535244 Hs.78305 

133259 AA278548 BE379546 Hs.6904 

129423 AA371418 AA204686 Hs.234149 

131098 RC_AA459668 U66669 Hs.236642 

135272 AA399391 AI828337 Hs.97591 

129155 AA046865 AI952677 Hs.108972 



IFLJ23342 

ESTs, Weakly similar to T51146 ring-box protein 1 [H.sapiens] 
peroxiredoxin 3 

RAB2, member RAS oncogene family 

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 2004403 
hypothetical protein FLJ20647 
3-hydroxyisobutyryl-Coenzyme A hydrolase 
ESTs 

Homo sapiens mRNA; cDNA DKFZp434P228 (from clone DKFZp434P228) 
311291 
120750 
101002 
133012 
103879 
131281 
115109 
118502 
134100 

131869 
115396 
103860 
135089 
129938 

107508 



123009 
131004 
103317 
132814 



110754 
132727 
100341 
134664 
103826 
111678 
101341 
115455 
111192 
129385 



Hs.319817 ESTs 



AA099241 

AA228148_s 

RC_M443212 

RC_AA256383 

RC_N67317 

L07540 

AA484944 



AA203742 

N75611_s 

U79300 

W90095 

AA005190 

AA203147 

RC_AA504125 

AA477046 

RC_AA479949 

D29833 

X83441 

RC_C15251_f 

U77718 

X59710 

RC_N20814 

AA136382_s 

D63506 

AA256106 

AA165564 

RC.R2C628 

L76159 



AA251716 
AJ249977 
AL157488 
AA460085 
AW968547 
AA810854 
AW976877 
AI918035 
AW003668 
N74925 
AA158008 
AL020996 
s AW969025 
AI936442 
AA535244 
D29833 
X83441 
D60730 
BE018142 
AL044818 
AW302200 
N27495 
AF032922 
AA256106 
AW162998 
R38487 
NM_004477 



Hs.169919 
Hs.62711 
Hs.50252 
Hs.25227 
Hs.88049 
Hs.50150 
Hs.171075 
Hs.33540 
Hs.89081 



122105 
121324 
120938 
115001 



RC_M477748 

RC_AA235604 AA172106 

RC_T79951 AW970209 

RC_AA432278 AW241685 

RC_AA404229 AA404229 

RC_AA386260 AA386260 

RC_AA251376 AA251376 



Hs.292444 

Hs.8518 

Hs.109154 

Hs.59838 

Hs.78305 

Hs.2207 

Hs.166091 

Hs.57471 

Hs.300954 



Hs.87507 
Hs.24684 
Hs.169927 
Hs.203772 
Hs.120551 
Hs.109438 
Hs.110950 
Hs.111805 
Hs.98699 
Hs.97842 
Hs.104632 



124799 RC_R45088 R45088 



AA457395 

N48325 

AA427396 



ESTs, Moderately similar to 2109260A B cell growth factor [H.sapiens] 
electron-transfer-flavoprotein, alpha polypeptide (glutaric aciduria II) 
Homo sapiens, clone IMAGE:3351295, mRNA 
mitochondrial ribosomal protein L32 
ESTs 

protein kinase, AMP-activated, gamma 3 non-catalytic subunit 

Homo sapiens mRNA; cDNA DKFZp564B182 (from clone DKFZp564B182) 

replication factor C (activator 1) 5 (36.5kD) 

ESTs, Weakly similar to dJ309K20.4 [H.sapiens] 

ESTs 

ESTs 

roundabout (axon guidance receptor, Drosophila) homolog 1 

Human clone 23529 mRNA sequence 

Homo sapiens cDNA: FLJ21564 fis, clone COL05452 

ESTs 

selenoprotein N 
ESTs 

hypothetical protein FLJ10808 
RAB2, member RAS oncogene family 
salivary proline-rich protein 
ligase IV, DNA, ATP-dependent 
ESTs 

Huntingtin interacting protein K 
nuclear transcription factor Y, beta 
KIAA0672 gene product 
hypothetical protein FLJ22626 
syntaxin binding protein 3 
ESTs 

KIAA1376 protein 
ESTs 

FSHD region gene 1 
toll-like receptor 10 

Homo sapiens clone 24775 mRNA sequence 



gb:zs10a06.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:684754 3', mRNA 
gb:yg38g04.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:34896 3', mRNA 



122724 RC_AA457395 
117791 RC_N48325 
121895 RC_AA427396 



Hs.99457 ESTs 



gb:zw33a02.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:771050 3' 



similar to contains Alu repetitive element;contains MER12.t2 MER12 repetitive element ;, mRNA sequence. 



gb:zm05c09.s1 Stratagene corneal stroma (937222) Homo sapiens cDNA clone IMAGE:513232 

AW877787 Hs.136102 KIAA0853 protein 

R77854 Hs.250693 Krueppel-related zinc finger protein 

AA447400 Hs.187684 ESTs, Weakly similar to B34087 hypothetical protein [H.sapiens] 

N74625 gb:za55c03.s1 Soares fetal liverspleen 1NFLS Homo sapiens cDNA clone IMAGE:296452 3' 



108244 RC_AA062839 
3', mRNA sequence. 

117852 RC_N49408 

109298 RC_AA205432 

122432 RC_AA447400 

124627 RC_N74625 

similar to gb:M14338 VITAMIN K-DEPENDENT PROTEIN S PRECURSOR (HUMAN);contains OFR.b3 OFR repetitive element ;, mRNA sequence. 

115141 RC_AA258071 AA465131 Hs.64001 Homo sapiens clone 25218 mRNA sequence 

128636 U49065 U49065 Hs.102865 interleukin 1 receptor-like 2 

115373 RC_AA282197 AA664862 Hs.181022 CGI-07 protein 

114651 RC_AA101400 AA101400 Hs.189960 ESTs 

132795 RC_AA180487 NM_006283 Hs.173159 transforming, acidic coiled-coil containing protein 1 

103749 RC_N35583 AL135301 Hs.8768 hypothetical protein FLJ10849 

107328 T83444 AW959891 Hs.76591 KIAA0887 protein 

115349 RC_AA281563 AF121176 Hs.12797 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 16 

111490 RC_R06862 R06862 gb:yf11e09.s1 Soares fetal liverspleen 1NFLS Homo sapiens cDNA clone IMAGE:126568 3' 



similar to contains L1 repetitive element ;, mRNA sequence. 



AA085291 

contains Alu repetitive element;, mRNA sequence 

118791 RC_N75520 N75520 Hs.261003 

116644 RC_F03032 F03032 Hs.290278 

116823 RC_H56485 AW204742 Hs.143542 
[H.sapiens] 

108940 RC_AA148603 AA148603 
IMAGE:567198 3', mRNA sequence. 

112218 RC_R50057 R50057 Hs.272251 

116557 RC_D20572_i D20572 Hs.90171 

133649 U25849 U25849 Hs.75393 

131745 RC_C20746 AI828559 Hs.31447 



gb:zn01g03.s1 Stratagene colon HT29 (937221) Homo sapiens cDNA clone 3' similar to 

ESTs, Moderately similar to B34087 hypothetical protein [H.sapiens] 
ESTs, Weakly similar to B34087 hypothetical protein [H.sapiens] 

ESTs, Highly similar to CSA_HUMAN COCKAYNE SYNDROME WD-REPEAT PROTEIN CSA 
gb:zo09e04.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone 
Homo sapiens mRNA; cDNA DKFZp586M1418 (from clone DKFZp586M1418) 
116801 


RC_H43879 


H43879 




sequenc 
115006 


RC_AA251548 


AA251548 


Hs.87886 


123424 


RC_AA598500 


H29882 


Hs.162614 


120831 


RC_AA347919 


AA347919 


Hs.95889 


103691 


AA018298 


AA018298 


Hs.103332 


121555 


RC_AA412491 


AF025771 


Hs.50123 


111193 


RC_N67946 


N67946 


Hs.117569 


132061 


RC_AA058946 


AB0207C0 


Hs.3830 


134575 


RC_AA194568_i 


AA194568 


Hs.85938 


115050 


RC_AA252794 


AA252794 


Hs.88009 


420208 


U31799 


BE276055 


Hs.95972 


133735 


AC002045_xpt1 


R66740 


Hs.110613 


128546 


Z21305 


NM_003478 


Hs.101299 


111946 


RC_R40697 


R40697 


Hs.76666 


124879 


RC_R73588 


R73588 


Hs.101533 


115683 


AA410345 


AF255910 


Hs.54650 


103692 


AA018418 


AW137912 


Hs.227583 


(CACNA1F) gene, complete cds; HSP27 pseudogene, o 


103767 


AA089688 


BE244667 


Hs.296155 


125266 


W90022 


W90022 


Hs.186809 


PRECURSOR [H.sapiens] 






135235 


AA435512 


AW298244 


Hs.293507 


134497 


RC_AA404494 


BE258532 


Hs.251871 


426754 


RCJ\A278529_i NMJM264 


Hs.172052 


412177 


RC_AA342828_s Z23091 


Hs.73734 


132000 


RC_AA044644 


AW247017 


Hs.36978 


124738 


RC_AA044644 


T07568 


Hs.137158 


324000 


RC_AA196729_i 


AA604749 


Hs.190213 


106896 


RC AA196729J 


AW073202 


Hs.334825 


132000 


RC_AA025858 


AW247017 


Hs.36978 


129577 


RC_AA025858 


N75346 


Hs.82906 


107091 


RC_AA233519 


AI949109 


Hs.246885 


130296 


RC_N52271 


D31139 


Hs.154103 


102855 


RC_N68399 


NM_003528 


Hs.2178 


113689 


RC_AA098874 


AB037850 


Hs.16621 


100939 


RC_AA279667_s 


L04288 


Hs.297939 


130430 


RC_H22556 


W27893 


Hs.150580 


106734 


RC_N45979_s 


BE296690 


Hs.288173 


intersectin 2 long isoform (ITSN2) mRNA 




135148 


RC AA431288 £ 


i AA306478 


Hs.95327 


134221 


RCJW609862 


BE280456 


Hs.80248 


105376 


RC_N35583 


AW994032 


Hs.8768 


124541 


U77718 


AF112222 


Hs.44499 


134546 


AA203147 


AL020996 


Hs.8518 


134000 


RC_W93092 


AW175787 


Hs.334841 


125656 


RC_W93092 


AW516428 


Hs.78687 


100939 


RC_N58561_s 


L04288 


Hs.297939 


125656 


RC_W93092 


AW516428 


Hs.78687 


101779 


RC_W69385_s 


BE543412 


Hs.250505 


332489 


RC_R22947 


R23053 


NA 


133000 


RC_N38959_f 


AL042444 


Hs.62402 


125905 


RC N38959J 


AI678638 


Hs.6455 


129000 


RC_H73050_s 


AA744902 


Hs.107767 


100920 


RC_H73050_s 


X54534 


Hs.278994 



gb:yo69h09.s1 Soares breast 3NbHBst Homo sapiens cDNA clone IMAGE:1f 



zinc finger protein 189 



junctional adhesion molecule 2 



glycoprotein V (platelet) 
melanoma antigen, family A, 3 
ESTs 
ESTs 

Homo sapiens cDNA FLJ14752 fis, clone NT2RP3003071 

melanoma antigen, family A, 3 

CDC20 (cell division cycle 20, S. cerevisiae, homolog) 

hypothetical protein FLJ20783 

LIM protein (similar to rat protein kinase C-binding enigma) 

H2B histone family, member Q 

DKFZP434I116 protein 

cathepsin B 

putative translation initiation factor 

Homo sapiens cDNA: FLJ21747 fis, clone COLF5160, highly similar to AF182198 Homo sapiens 

CD3D antigen, delta polypeptide (TiT3 complex) 
RNA-binding protein gene with multiple splicing 
hypothetical protein FLJ10849 
pinin, desmosome associated protein 



neutral sphingomyelinase (N-SMase) activation associated factor 

retinoic acid receptor, alpha 

HuOIChipRedos 

p21/Cdc42/Rac1-activated kinase 1 (yeast Ste20-related) 
chaperonin containing TCP1, subunit 2 (beta) 
hypothetical protein PRO1489 
Rhesus blood group, CcEe antigens 
TABLE 1A 

Table 1A shows the accession numbers for those pkeys lacking UnigeneIDs for Table 1. The pkeys in Table 7 lacking UnigeneIDs are represented within 
Tables 1-6A. For each probeset, we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled 
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and 
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for the sequences comprising each cluster are listed in the "Accession" 
column. 
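
The lookup that this table supports, from a probeset (Pkey) to its gene cluster (CAT number) and on to the Genbank accessions in that cluster, can be sketched in Python. This is an illustrative sketch only and is not part of the original filing: the names `Table1AEntry`, `parse_row`, and `index_by_pkey` are hypothetical, and the two sample rows are taken from Table 1A with their CAT numbers read as ending in `_1`, which is an assumption about the printed suffix.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Table1AEntry:
    """One Table 1A row (hypothetical representation)."""
    pkey: int                    # Eos probeset identifier
    cat_number: str              # gene cluster the probeset was designed from
    accessions: Tuple[str, ...]  # Genbank accessions making up the cluster


def parse_row(row: str) -> Table1AEntry:
    """Split a whitespace-delimited row into Pkey, CAT number, accessions."""
    pkey, cat_number, *accessions = row.split()
    return Table1AEntry(int(pkey), cat_number, tuple(accessions))


def index_by_pkey(rows: Iterable[str]) -> Dict[int, Table1AEntry]:
    """Build a Pkey -> entry lookup from raw rows."""
    return {entry.pkey: entry for entry in map(parse_row, rows)}


# Sample rows; the "_1" CAT suffix is an assumed reading of the printed text.
ROWS = [
    "121059 273450_1 AA393283 AA398628",
    "121094 275729_1 AA402505 AA398900",
]
INDEX = index_by_pkey(ROWS)
```

With this index, `INDEX[121059].accessions` gives the accessions listed for the cluster behind probeset 121059.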



Pkey CAT Number Accession 



108469 116761J 
124106 125446J 
108501 13684.-12 
108562 36375J 
125008 
125020 
125066 
116661 
125104 
124575 
125263 
116845 
118417 



116017J 

1814993J 

1532859J 

413347J 

1666649J 

1547J 

393481J 

37186J 



AA100795 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274 

T91251 T64891 T85665 

T69981 T69924 AA078476 



R61504 F04247 
T95590 AA703278 H62764 
N66168 N69188 N9045Q 



118584 
103743 
103744 
103746 



112194_1 
114161_1 
113452_1 
114208_1 
48290_6 
1531817_1 
158963_1 
1605263_1 



112540 
111904 

121059 273450_1 AA393283 AA398628 
121094 275729_1 AA402505 AA398900 
114106 



AA649530 AA659316 H64973 

AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 AI636743 AW614951 BE467547 AI680833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 AI970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AI480345 AW013827 AA248638 AI214968 AA204735 AA207155 AA206262 AA204833 
AW003247 AW496808 AI080480 AI631703 AI651023 AI867418 AW818140 AA502500 AI206199 AI671282 AI352545 BE501030 
AI652535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 AA332922 N66048 
AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE466611 AI206344 AA574397 
AA348354 AI493192 

AW136928 AI685655 BE218584 BE465078 N68963 AA975338 BE147199 N76377 
AA075998 AA075999 AA070986 AA070896 AA129207 AA078942 AA070783 AA078941 
AA079267 AA076003 
AA075000 AA081876 

AA765163 AW298222 AA126126 AA085138 AA076068 
AA085291 AA085354 
F02951 Z40892 F04711 
AA179656 AA182626 AA182603 
R69751 R70467 H69771 H80879 H80878 
Z41572 R39330 



122264 



130529 
108309 
107832 
123731 
116571 
132225 
125017 
125053 



125118 
102269 
125150 
116801 
118111 



W88999 

AA436837 AA442594 
AA065069 AA085108 
R23053 R79884 R76271 
AA178953 AA192740 



1182096J 
2396L-3 
296527J 
110682J 
1706092J 
158447J 
111495J 
genbank_AA021473 AA021473 
genbank_AA609839 AA609839 
genbank_D45652 D45652 
genbank_AA128980 AA128980 
genbank_T68875 T68875 
genbank_T85352 T85352 
genbank_T85373 T85373 
entrez_J00212 J00212 
149288_1 R10606 T97620 AA576309 
entrez_U30245 U30245 

NOT_FOUND_entrez_W38240 W38240 



genbank_H43879 
genbank_N55493 
genbank_N57493 



H43879 
N55493 
N57493 



118475 genbank_N66845 

111490 genbank_R06862 

111514 genbank_R07998 

104534 R22303_at R22303 

120340 genbank_AA206828 
120376 
104787 
120409 
120745 



113702 
115001 
122562 
122635 
108244 
108277 
122723 
124028 
108403 



117031 
124254 
101447 
101458 
124577 



124627 
124720 
124793 
124799 
117583 
117732 
124991 
119023 



genbank_AA227469 

genbank_AA027317 

genbank_AA235050 

genbank_AA302809 

genbank_AA346495 

genbank_AA348913 

genbank_T97307 

genbank_AA251376 

genbank_AA452156 

genbank_AA454085 

genbank_AA062839 

genbank_AA064859 

genbank_AA457380 

genbank_F04112 

genbank_AA075374 

genbank_AA464414 

genbank_AA076382 

genbank_AA078985 

231290J 



genbank_AA084415 

genbank_H88353 

genbank_H69899 

entrez_M21305 

entrez_M22092 

genbank_N68300 

genbank_AA148603 

genbank_AA148650 

genbank_N74625 

144582J 

genbank_R44519 



genbank_N40180 
genbank_N46452 
genbank_T50116 
genbank_N98488 
95573_2 



AA227469 

AA027317 

AA235050 

AA302809 

AA346495 

AA348913 

T97307 

AA251376 

AA452156 

AA454085 



AA457380 
F04112 
AA075374 
AA464414 



AW411259 H23555 AW015049 AIS84275 AW015886 AW068953 AW014085 AI027260 R52686 AA918278 AI129462 

N34869 AI948416 AA534205 AA702483 AA705232 

AA084415 

H88353 



119558 NOT_FOUND_entrez_W38194 



AA148650 
N74625 

R05283 R11056 

R44519 

R45088 

N40180 

N46452 

T50116 

N98488 

T11483 T11472 



119654 
105246 
121350 
121558 
105985 
100071 
114648 
121895 



123315 714071J AA496369 AA496646 



genbank_W57759 
genbank_AA226879 
genbank_AA405237 
genbank_AA412497 
genbank_AA406610 
entrez_A28102 A28102 
genbank_AA101056 
genbank_AA427396 



W57759 
AA226879 
AA405237 
AA412497 
AA406610 
TABLE 2: 






Pkey:           Unique Eos probeset identifier number 
Accession:      Accession number used for previous patent filings 
ExAccn:         Exemplar Accession number, Genbank accession number 
UnigeneID:      Unigene number 
Unigene Title:  Unigene gene title 
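
The five columns defined in this legend map naturally onto a small record type. The following Python sketch is illustrative only and is not part of the original filing: `Table2Record`, `make_record`, and `lacking_unigene_id` are hypothetical names, and the two sample rows (Pkeys 101168 and 101261) are taken from Table 2, the second being a row printed without a Unigene number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Table2Record:
    """One Table 2 row (hypothetical representation)."""
    pkey: int                  # unique Eos probeset identifier number
    accession: str             # accession number used for previous patent filings
    ex_accn: str               # exemplar Genbank accession number
    unigene_id: Optional[str]  # Unigene number; None when the table leaves it blank
    unigene_title: str         # Unigene gene title (truncated as printed)


def make_record(pkey: str, accession: str, ex_accn: str,
                unigene_id: str, unigene_title: str) -> Table2Record:
    """Build a record from raw column strings, mapping a blank Unigene cell to None."""
    return Table2Record(int(pkey), accession, ex_accn,
                        unigene_id.strip() or None, unigene_title.strip())


def lacking_unigene_id(records: List[Table2Record]) -> List[int]:
    """Return the Pkeys of rows that have no Unigene number."""
    return [r.pkey for r in records if r.unigene_id is None]


RECORDS = [
    make_record("101168", "101168", "NM_005308", "Hs.211569",
                "G protein-coupled receptor kinase 5"),
    make_record("101261", "101261", "D30857", "",
                "protein C receptor, endothelial (EPCR)"),
]
```

Rows returned by `lacking_unigene_id` are the kind of entries for which a companion table such as Table 1A supplies cluster accessions instead.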




Pkey    Accession  ExAccn     UnigeneID  UnigeneTitle 


100420  100420     D86983     Hs.118893  Melanoma associated gene
100484  100484     NM_005402  Hs.288757  v-ral simian leukemia viral oncogene hom
100991  100991     J03836     Hs.82085   serine (or cysteine) proteinase inhibito
101168  101168     NM_005308  Hs.211569  G protein-coupled receptor kinase 5
101261  101261     D30857                protein C receptor, endothelial (EPCR)
101447  101447     M21305                gb:Human alpha satellite and satellite 3
101543  101543     M31166     Hs.2050    pentaxin-related gene, rapidly induced b
101560  101560     AW958272   Hs.347326  intercellular adhesion molecule 2
101714  101714     M68874     Hs.211587  phospholipase A2, group IVA (cytosolic,
101838  101838     BE243845   Hs.75511   connective tissue growth factor
102012  102012     BE259035   Hs.118400  singed (Drosophila)-like (sea urchin fas
102164  102164     NM_000107  Hs.77602   damage-specific DNA binding protein 2 (4
102283  102283     AW161552   Hs.83381   guanine nucleotide binding protein 11
102564  102564     U59423     Hs.79067   MAD (mothers against decapentaplegic, Dr
102759  102759     NM_005100  Hs.788     A kinase (PRKA) anchor protein (gravin)
102804  102804     NM_002318  Hs.83354   lysyl oxidase-like 2
102898  102898     NM_002205  Hs.149609  integrin, alpha 5 (fibronectin receptor,
103036  103036     M13509     Hs.83169   matrix metalloproteinase 1 (interstitial
103095  103095     NM_005424  Hs.78824   tyrosine kinase with immunoglobulin and
103166  103166     AA159248   Hs.180909  peroxiredoxin 1
103280  103280     U84722     Hs.76206   cadherin 5, type 2, VE-cadherin (vascula
103850  103850     AA187101   Hs.213194  hypothetical protein MGC10895
104592  104592     AW630488   Hs.25338   protease, serine, 23
104786  104786     AA027167   Hs.10031   KIAA0955 protein
104865  104865     T79340     Hs.22575   B-cell CLL/lymphoma 6, member B (zinc fi
104952  104952     AW076098   Hs.345588  desmoplakin (DPI, DPII)
105178  105178     AA313825   Hs.21941   AD036 protein
105330  105330     AW338625   Hs.22120
105729  105729     H46612     Hs.293815  Homo sapiens HSPC285 mRNA, partial cds
105977  105977     AK001972   Hs.30822   hypothetical protein FLJ11110
106031  106031     X64116     Hs.171844  Homo sapiens cDNA: FLJ22296 fis, clone H
106155  106155     AA425414   Hs.33287   nuclear factor I/B
106423  106423     AB020722   Hs.16714   Rho guanine exchange factor (GEF) 15
107174  107174     BE122762   Hs.25338   ESTs
107295  107295     AA186629   Hs.80120   UDP-N-acetyl-alpha-D-galactosamine:polyp
108756  108756     AA127221   Hs.117037  ESTs
108888  108888     AA135606   Hs.189384  gb:zl10a05.s1 Soares_pregnant_uterus_NbH
109166  109166     AA219691   Hs.73625   RAB6 interacting, kinesin-like (rabkines
109768  109768     F06838     Hs.14763   ESTs
110906  110906     AA035211   Hs.17404   ESTs
111006  111006     BE387014   Hs.166146  Homer, neuronal immediate early gene, 3
111133  111133     AW580939   Hs.97199   complement component C1q receptor
113073  113073     N39342     Hs.103042  microtubule-associated protein 1B
113923  113923     AW953484   Hs.3849    hypothetical protein FLJ22041 similar to
115061  115061     AI751438   Hs.41271   Homo sapiens mRNA full length insert cDN
115145  115145     AA740907   Hs.88297   ESTs
115947  115947     R47479     Hs.94761   KIAA1691 protein
116339  116339     AK000290   Hs.44033   dipeptidyl peptidase 8
116589  116589     AI557212   Hs.17132   ESTs, Moderately similar to I54374 gene
117023  117023     AW070211   Hs.102415  Homo sapiens mRNA; cDNA DKFZp586N0121 (f
117563  117563     AF055634   Hs.44553   unc5 (C.elegans homolog) c
118475  118475     N65845                gb:za46c11.s1 Soares fetal liver spleen
119073  119073     BE245360   Hs.279477  ESTs
119174  119174     R71234                gb:yi54c08.s1 Soares placenta Nb2HP Homo
119416  119416     T97186                gb:ye50h09.s1 Soares fetal liver spleen
121335  121335     AA404418              gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_
123160  123160     AA488687   Hs.284235  ESTs, Weakly similar to I38022 hypotheti
123523  123523     AA608588              gb:ae54e06.s1 Stratagene lung carcinoma
123964  123964     C13961                gb:C13961 Clontech human aorta polyA+ mR
124315  124315     NM_005402  Hs.288757  v-ral simian leukemia viral oncogene hom
124669  124669     AI571594   Hs.102943  hypothetical protein MGC12916
124875  124875     AI887664   Hs.285814  sprouty (Drosophila) homolog 4
125103  125103     AA570055   Hs.122730  ESTs, Moderately similar to KIAA1215 pro
125565  125565     R20840                gb:yg05c08.r1 Soares infant brain 1NIB H

Hs.57958 


EGF-TM7-latrophilin-related protein 




^6649 


AA001860 


Hs.279531 




449602 




AA001860 


Hs.279531 








AA358869 


Hs.227949 


SEC13 (S. cerevisiae)-like 1 






H04150 


Hs.107708 




129188  129188     NM_001078  Hs.109225  vascular cell adhesion molecule 1
129371  129371     X06828     Hs.110802  von Willebrand factor
129765  129765     M86933                amelogenin (Y chromosome)
129884  129884
130639  130639     AI557212   Hs.17132   ESTs, Moderately similar to I54374 gene
130828  130828     AW631469
131080  131080     NM_001955             endothelin 1
131182  131182     AI824144   Hs.23912
131573  131573     AA040311
131756  131756     AA443966   Hs.31595
131881  131881     AW361018              upstream regulatory element binding prot
132083  132083     BE386490
132358  132358     NM_003542  Hs.46423   H4 histone family, member G
132456  132456     AB011084   Hs.48924   KIAA0512 gene product; ALEX2
132676  132676     N92589     Hs.261038  ESTs, Weakly similar to I38022 hypotheti
132718  132718     NM_004600             Sjogren syndrome antigen A2 (60kD, ribon
132760  132760     AA125985   Hs.56145   thymosin, beta, identified in neuroblast
132968  132968     AF234532   Hs.61638   myosin X
133061  133061     AI186431   Hs.296638  prostate differentiation factor
133161  133161     AW021103              hypothetical protein FLJ20373
133260  133260     AA403045   Hs.6906    Homo sapiens cDNA: FLJ23197 fis, clone R
133491  133491     BE619053   Hs.170001  eukaryotic translation initiation factor
133550  133550     AI129903   Hs.74669   vesicle-associated membrane protein 5 (m
133614  133614     NM_003003  Hs.75232   SEC14 (S. cerevisiae)-like 1
133691  133691     M85289     Hs.211573  heparan sulfate proteoglycan 2 (perlecan
133913  133913     AU076964              calumenin
133985  133985                Hs.78146   platelet/endothelial cell adhesion molec
134088  134088     AI379954   Hs.79025
134299  134299     AW580939   Hs.97199   complement component C1q receptor
116470  116470     AI272141
134989  134989     AW968058              nudix (nucleoside diphosphate linked moi
135073  135073                Hs.94030   Homo sapiens mRNA; cDNA DKFZp586E1624 (f
100114  100114     X02308     Hs.82962   thymidylate synthetase
100143  100143     AU076465   Hs.278441  KIAA0015 gene product
100208  100208     NM_002933             ribonuclease, RNase A family, 1 (pancrea
100405  100405     AW291587   Hs.82733   nidogen 2
100455  100455     AW888941   Hs.75789   N-myc downstream regulated
100618  100618     AI752163   Hs.114599  collagen, type VIII, alpha 1
100658  100658     U56725     Hs.180414  heat shock 70kD protein 2
100718  100718     BE295928              Inhibitor of DNA binding 1, dominant neg
100828  100828     AL048753   Hs.303649  small inducible cytokine A2 (monocyte ch
100991  100991     J03836     Hs.82085   serine (or cysteine) proteinase inhibito
101110  101110     AI439011   Hs.86386
101156  101156     AA340987   Hs.75693   prolylcarboxypeptidase (angiotensinase C
101184  101184     NM_001674
101317  101317                Hs.8302    four and a half LIM domains 2
101345  101345     NM_005795  Hs.152175  calcitonin receptor-like
101475  101475     BE410405   Hs.76288   calpain 2, (m/II) large subunit
101496  101496     X12784     Hs.119129  collagen, type IV, alpha 1
101543  101543     M31166     Hs.2050    pentaxin-related gene, rapidly induced b
101560  101560     AW958272   Hs.347326  intercellular adhesion molecule 2
101592  101592     AF064853   Hs.91299   guanine nucleotide binding protein (G pr
101634  101634     AV650262   Hs.75765   GRO2 oncogene
101682  101682     AF043045   Hs.81008   filamin B, beta (actin-binding protein-2
101720  101720                Hs.81328   nuclear factor of kappa light polypeptid
101744  101744     AI879352   Hs.118625  hexokinase 1
101837  101837                           zinc finger protein homologous to Zfp-36
101840  101840     AA236291   Hs.183583  serine (or cysteine) proteinase inhibito
101864  101864     BE392588
101966  101966                Hs.76095   immediate early response 3
102013  102013     BE616287              catenin (cadherin-associated protein), a
102059  102059     AI752666   Hs.76669   nicotinamide N-methyltransferase
102283  102283     AW161552   Hs.83381   guanine nucleotide binding protein 11
102378  102378     AU076887   Hs.28491   spermidine/spermine N1-acetyltransferase
102460  102460     U48959     Hs.211582  myosin, light polypeptide kinase
102499  102499     BE243877   Hs.76941   ATPase, Na+/K+ transporting, beta 3 poly
102560  102560     R97457     Hs.63984   cadherin 13, H-cadherin (heart)
102589  102589     AU076728   Hs.8867    cysteine-rich, angiogenic inducer, 61
102645  102645     AL119566   Hs.6721    lysosomal
102693  102693     AA532780   Hs.183684  eukaryotic translation initiation factor
102759  102759     NM_005100  Hs.788     A kinase (PRKA) anchor protein (gravin)
102882  102882     AI767736   Hs.290070  gelsolin (amyloidosis, Finnish type)
102915  102915     X07820     Hs.2258    matrix metalloproteinase 10 (stromelysin
102960  102960     AI904738   Hs.76053   DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
103020  103020     X53416     Hs.195464  filamin A, alpha (actin-binding protein-
103036  103036     M13509     Hs.83169   matrix metalloproteinase 1 (interstitial
103080  103080     AU077231   Hs.82932   cyclin D1 (PRAD1: parathyroid adenomatos
103138  103138     X65965                gb:H.sapiens SOD-2 gene for manganese su
103195  103195     AA351647   Hs.2642    eukaryotic translation elongation factor
103371  103371     X91247     Hs.13046   thioredoxin reductase 1
103471  103471     Y00815     Hs.75216   protein tyrosine phosphatase, receptor t
104447  104447     AW204145   Hs.156044  ESTs
104783  104783     AA533513   Hs.93659   protein disulfide isomerase related prot
104865  104865     T79340     Hs.22575   B-cell CLL/lymphoma 6, member B (zinc fi
104894  104894     AF065214   Hs.18858   phospholipase A2, group IVC (cytosolic,
105113  105113     AB037816   Hs.8982    Homo sapiens, clone IMAGE:3506202, mRNA,
105196  105196     W84893     Hs.9305    angiotensin receptor-like 1
105263  105263     AW388633   Hs.6682    solute carrier family 7, (cationic amino
105330  105330     AW338625   Hs.22120   ESTs
105492  105492     AI805717   Hs.289112  CGI43 protein
105594  105594     AB024334   Hs.25001
105732  105732     AW504170   Hs.274344  hypothetical protein MGC12942
105882  105882     W46802     Hs.81988   disabled (Drosophila) homolog 2 (mitogen
106031  106031     X64116     Hs.171844  Homo sapiens cDNA: FLJ22296 fis, clone H
106222  106222     AA355392   Hs.21321   Homo sapiens clone FLB9213 PRO2474 mRNA,
106263  106263     W21493     Hs.28329   hypothetical protein FLJ14005
106366  106366     AA186715   Hs.336429  RIKEN cDNA 9130422N19 gene
106634  106634     W25491     Hs.288909  hypothetical protein FLJ22471
106793  106793     H94997     Hs.16450   ESTs
106842  106842     AF124251   Hs.26054   novel SH2-containing protein 3
106890  106890     AA489245   Hs.88500   mitogen-activated protein kinase 8 inter
106974  106974     AI817130   Hs.9195    Homo sapiens cDNA FLJ13698 fis, clone PL
107061  107061     BE147611   Hs.6354    stromal cell derived factor receptor 1
107216  107216     □51069     Hs.211579  melanoma cell adhesion molecule
107444  107444     W28391     Hs.343258  proliferation-associated 2G4, 38kD
108507  108507     AI554545   Hs.68301   ESTs
108931  108931     AA147186              gb:zo38d01.s1 Stratagene endothelial cel
109195  109195     AF047033   Hs.132904  solute carrier family 4, sodium bicarbon
109455  109456     AW956580   Hs.42699   ESTs
110411  110411     AW001579   Hs.9645    Homo sapiens mRNA for KIAA1741 protein,
110906  110906     AA035211   Hs.17404   ESTs
111091  111091     AA300067   Hs.33032   hypothetical protein DKFZp434N185
111378  111378     AW160993   Hs.326292  hypothetical gene DKFZp434A1114
111769  111769     AW629414   Hs.24230   ESTs
112951  112951     AA307634   Hs.6650    vacuolar protein sorting 45B (yeast homo
113195  113195     H83265     Hs.8881    ESTs, Weakly similar to S41044 chromosom
113542  113542     H43374     Hs.7890    Homo sapiens mRNA for KIAA1671 protein,
113847  113847     NM_005032  Hs.4114    plastin 3 (T isoform)
113947  113947     W84768                gb:zh53d03.s1 Soares_fetal_liver_spleen_
115061  115061     AI751438   Hs.41271   Homo sapiens mRNA full length insert cDN
115870  115870     NM_005985  Hs.48029   snail 1 (drosophila homolog), zinc finge
116228  116228     AI767947   Hs.50841   ESTs
116314  116314     AI799104   Hs.178705  Homo sapiens cDNA FLJ11333 fis, clone PL
117023  117023     AW070211   Hs.102415  Homo sapiens mRNA; cDNA DKFZp586N0121 (f
117156  117156     W73853                ESTs
117280  117280     M18217     Hs.172129  Homo sapiens cDNA: FLJ21409 fis, clone C
119866  119856     AA496205   Hs.193700  Homo sapiens mRNA; cDNA DKFZp586I0324 (f
121314  121314     W07343     Hs.182538  phospholipid scramblase 4
121822  121822     AI743860              metallothionein 1E (functional)
122331  122331     AL133437   Hs.110771  Homo sapiens cDNA: FLJ21904 fis, clone H
123160  123160     AA488687   Hs.284235  ESTs, Weakly similar to I38022 hypotheti
124059  124059     BE387335   Hs.283713  ESTs, Weakly similar to S64054 hypotheti
124358  124358     AW070211   Hs.102415  Homo sapiens mRNA; cDNA DKFZp586N0121 (f
124726  124726     NM_003654  Hs.104576  carbohydrate (keratan sulfate Gal-6) sul
125167  125167     AL137540   Hs.102541  netrin 4
125307  125307     AW580945   Hs.330466  ESTs
107985  107985     T40064     Hs.71968   Homo sapiens mRNA; cDNA DKFZp564F053 (fr
125598  125598     T40064     Hs.71968   Homo sapiens mRNA; cDNA DKFZp564F053 (fr
413731  413731     BE243845   Hs.75511   connective tissue growth factor
116024  116024     AA088767              transmembrane, prostate androgen induced
418000  418000     AA932794   Hs.83147   guanine nucleotide binding protein-like
126399  126399     AA088767   Hs.83883   transmembrane, prostate androgen induced
127566  127566     AI051390   Hs.116731  ESTs
128453  128453     X02761     Hs.287820  fibronectin 1
128515  128515     BE395085   Hs.10086   type I transmembrane protein Fn14
128623  128623     BE076608   Hs.105509  CTL2 gene
128669 128669 


W28493 


Hs.180414 


heat shock 70kD protein 8 
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128914 128914 


AW867491 


Hs.107125 


plasmalemma vesicle associated protein 


129188 129188 


NM_001078 


Hs.109225 


vascular cell adhesion molecule 1 


129265 129265 


AA530892 


Hs.171695 


dual specificity phosphatase 1 


129468 129468 


AW410538 


Hs.111779 


secreted protein, acidic, cysteine-rich 


101838 101838 


BE243845 


Hs.75511 


connective tissue growth factor 


129619 129619 


AA209534 


Hs.284243 


tetraspan NET-3 protein 


129762 129762 


AA453694 


Hs.12372 


tripartite motif protein TRIM2 


130018 130018 


AA353093 




metallothloneln 1L 


130178 130178 


U20982 




insulin-like growth factor-binding prote


130431 130431 


AW505214 


Hs.155560 


calnexin 


130553 130553 


AF062649 


Hs.252587 


pituitary tumor-transforming 1 


130639 130639 


AI557212 




ESTs, Moderately similar to I54374 gene 


130686 130686 


BE548267 


Hs.337986 


Homo sapiens cDNA FLJ10934 fis, clone OV


130818 130818 


AW190920 


Hs.19928 


hypothetical protein SP329 


130899 130899 


AI077288 


Hs.296323 


serum/glucocorticoid regulated kinase 


131080 131080 


NM_001955 


Hs.2271 


endothelin 1


131091 131091 


AJ271216 


Hs.22880 


dipeptidylpeptidase III 


131182 131182 


AI824144 


Hs.23912 


ESTs 


131319 131319 


NM_003155


Hs.25590 


stanniocalcin 1 


131328 131328 


AW939251 


Hs.25647 




131328 131328 


AW939251 


Hs.25647 


v-fos FBJ murine osteosarcoma viral onco 


131555 131555 


T47364 


Hs.278613 


interferon, alpha-inducible protein 27 


131573 131573 


AA040311 


Hs.28959 




131756 131756 


AA443966 


Hs.31595 




131909 131909 


NM_016558 


Hs.274411 


SCAN domain-containing 1 


132046 132046 


AI359214 


Hs.179260 


chromosome 14 open reading frame 4 


132151 132151 


BE379499 


Hs.173705 


Homo sapiens cDNA: FLJ22050 fis, clone H 


132187 132187 


AA235709 


Hs.4193 


DKFZP58601 624 protein 


132314 132314 


AF112222


Hs.323805 


pinin, desmosome associated protein 


132398 132398 


AA876616 


Hs.16979


ESTs, Weakly similar to A43932 mucin 2 p 


132490 132490 


NM_001290 


Hs.4980 


LIM domain binding 2 


132546 132546 


M24283 


Hs.168383


intercellular adhesion molecule 1 (CD54) 


132716 132716 


BE379595 


Hs.283738 


casein kinase 1, alpha 1 


132883 132883 


AA373314 




Homo sapiens mRNA; cDNA DKFZp586P1622 (f 


132989 132989 


AA480074 


Hs.331328 


hypothetical protein FLJ13213 


133071 133071 


BE384932 


Hs.64313 


ESTs, Weakly similar to AF257182 1 G-pro


133099 133099 


W16518 


Hs.279518 


amyloid beta (A4) precursor-like protein 


133149 133149 


AA370045 


Hs.6607 


AXIN1 up-regulated


133200 133200 


AB037715 


Hs.183639 


hypothetical protein FLJ10210


133260 133260 


AA403045 


Hs.6906 


Homo sapiens cDNA: FLJ23197 fis, clone R 


133349 133349 


AW631255 


Hs.8110 


L-3-hydroxyacyl-Coenzyme A dehydrogenase 


133398 133398 


NM_000499 


Hs.72912 


cytochrome P450, subfamily I (aromatic c 


133454 133454 


BE547647 


Hs.177781 


hypothetical protein MGC5618 


133491 133491 


BE619053 


Hs.170001 


eukaryotic translation initiation factor 


133517 133517 


NM_000165 


Hs.74471 


gap junction protein, alpha 1, 43kD (con 


133538 133538 


NM_003257 


Hs.74614 


tight junction protein 1 (zona occludens 


133584 133584 


D90209 


Hs.181243 


activating transcription factor 4 (tax-r 


133617 133617 


BE244334 


Hs.75249 


ADP-ribosylation factor-like 6 interacti 


133671 133671 


AW503116 


Hs.301819 


zinc finger protein 146 


133681 133681 


AI352558 




tyrosine 3-monooxygenase/tryptophan 5-mo 


133730 133730 


BE242779 


Hs.179526 


upregulated by 1,25-dihydroxyvitamin D-3


133802 133802 


AW239400 


Hs.76297 


G protein-coupled receptor kinase 6 


133838 133838 


BE222494 


Hs.180919


inhibitor of DNA binding 2, dominant neg 


133889 133889 


U48959 


Hs.211582 


myosin, light polypeptide kinase 


133975 133975 


C18356 


Hs.295944 


tissue factor pathway inhibitor 2 


134039 134039 


NM_002290 


Hs.78872 


laminin, alpha 4 


134081 134081 


AL034349 


Hs.79005 


protein tyrosine phosphatase, receptor t 


134203 134203 


AA161219 


Hs.799 


diphtheria toxin receptor (heparin-bindi 


134299 134299 


AW580939 


Hs.97199 


complement component C1q receptor 


134339 134339 


R70429 


Hs.81988 


disabled (Drosophila) homolog 2 (mitogen 


134381 134381 


AI557280 


Hs.184270 


capping protein (actin filament) muscle 


134416 134416 


X68264 


Hs.211579 


melanoma cell adhesion molecule 


134558 134558 


NM_001773


Hs.85289 


CD34 antigen 


134983 134983 




Hs.196384 


prostaglandin-endoperoxide synthase 2 (p 


135052 135052 


AL136653


Hs.93675 


decidual protein induced by progesterone 


135069 135069 


AA876372 


Hs.93961 


Homo sapiens mRNA; cDNA DKFZp567D095 (fr 


135073 135073 


W55956 


Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


135196 135196 




Hs.9615 


myosin regulatory light chain 2, smooth 


134404 134404 


AB000450 


Hs.82771 




100082 100082 


M130080 


Hs.4295 


proteasome (prosome, macropain) 26S subu


130150 130150 


BE094848 


Hs.15113 


homogentisate 1,2-dioxygenase (homogenti 


130839 130839 


AB011169 


Hs.20141 


similar to S.cerevisiae SSM4


100113 100113 


NM_001269


Hs.84746 


chromosome condensation 1 


100129 100129 


AA469359 


Hs.5831 


tissue inhibitor of metalloproteinase 1 


100169 100169 


AL037228 


Hs.82043 


D123 gene product


100190 100190 


M91401 


Hs.178658


RAD23 (S. cerevisiae) homolog B 
100211 100211


D26528


Hs.123058


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 


130283 130283 


NM_012288


Hs.153954




100248 100248 


NM_015156


Hs.78398 


KIAA0071 protein 


100262 100262 




Hs.278468




100281 100281 


AF091035 


Hs.184627


KIAA0118 protein


100327 100327 


D55640 




KIAA0143 protein


134495 134495 








135152 135152 


M96954 








NM_014791


Hs.184339






DB4284 


Hs.66052




100418 100418 




Hs.84790 


KIAA0225 protein 


134347 134347 


AF164142






100438 100438 


AA013051






100481 100481




Hs.121489


cystatin D 


100591 100591 


NM_004091


Hs.231444




100662 100662 


AI368680






100905 100905 




Hs.172816


neuregulin 1


100950 100950 




Hs.166846




135407 135407 


J04029




keratin 10 (epidermolytic hyperkeratosis


131877 131877 


J04088 


Hs.156346




134786 134786 




Hs.89640 




134078 134078 




Hs.78995 


MADS box transcription enhancer factor 


134849 134849 








101152 101152 


AI984625 






131687 131687 








421155 421155 








133975 133975




Hs.295944








Hs.151254




132813 132813 


BE313625 




solut^carrier famMv'll to r oton-raiJDled 


101300 101300 








130344 130344 




Hs.154879




101381 101381 


AW675039 






133780 133780 


AA557660 


Hs.7b152 




101447 101447 






gb:Human alpha satellite and satellite 3


101470 101470 


NM_000546




tumor protein p53 (Li-Fraumeni syndrome) 


101478 101478 


NM_002890 






133519 133519 


AW583062






134116 134116 








130174 130174 




Hs.151531


rote nDhosDh^ 


132983 132983 








101543 101543 


M31166


Hs.2050


pentaxin-related gene, rapidly induced b






Hs.247930




133595 133595 










D90337 


Hs.247916 


Tri Cn 1 Mp'nrecursor C 


134246 134246 








133948 133948 


X59960






133948 133948 


X59960 


Hs.77813 




133948 133948 




Hs.77813 


sphingomyelin phosphodiesterase 1, acid 


101812 101812 


BE439894 






133396 133396 


M96326 


Hs.72885 




129026 129026 


AL120297


Hs.108043




134831 134831 








134395 134395 


AA456539 




lysosomal 


101977 101977 




Hs.184062




101998 101998 








102007 102007 








416658 416658 




Hs.79432




135389 135389 




Hs.99872 


fetal Alzheimer antigen 




U34820 


Hs.151051










alpha thalassemia/mental retardation syn 


102123 102123 








102133 102133 




nS.loooyo 




102162 102162 








427653 427653 


AA159001 


Hs.180069


nuclear respiratory factor 1 


102200 102200 


AA232362 


Hs.157205


branched chain aminotransferase 1, cytos


102214 102214 






SRY (sex determining region Y)-box 11


131319 131319 


NM_003155


Hs.25590 


stanniocalcin 1 


132316 132316 


U28831 


Hs.44566


KIAA1641 protein 


134365 134365 


AA568906 


Hs.82240


syntaxin 3A 


102298 102298 


AA382169 


Hs.54483 


N-myc (and STAT) interactor 


302344 302344 


BE303044 


Hs.192023 


eukaryotic translation initiation factor 


102367 102367 


U39656 


Hs.118825


mitogen-activated protein kinase kinase 


102394 102394 


NM_003816


Hs.2442 


a disintegrin and metalloproteinase doma


129521 129521 


AF071076 


Hs.112255


nucleoporin 98kD


102251 102251 


NM_004398


Hs.41706 


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 


133746 133746 


AW410035


Hs.75862 


MAD (mothers against decapentaplegic, Dr 
132828 132828


AB014615 


Hs.57710 


fibroblast growth factor 8 (androgen-ind


132828 132828 


AB014615 




fibroblast growth factor 8 (androgen-ind


130441 130441 


U63630 


Hs.155637










Human BRCA2 region mRNA sequence CG006 










102560 102560 




Hs.63984 




134305 134305 




Hs.81424


ubiquitin-like 1 (sentrin)


132736 132736 






Homo sapiens cDNA: FLJ23037 fis, clone L 


102663 102663 


NM_002270 






102735 102735 


AF111106 






101175 101175 




Hs.36980 


melanoma antigen, family A, 2 


132164 132164 


AI752235






102826 102826 


NM_007274 


Hs.8679 




102846 102846 


BE264974 




thyroid hormone receptor interactor 13


134161 134161 


AA634543 




IGF-II mRNA-binding protein 3 


302363 302363 


AW163799 


Hs.198365 


2,3-bisphosphoglycerate mutase 


125701 125701 






apolipoprotein A-I


134656 134656 


AI750878 


Hs.87409 


thrombospondin 1


102968 102968 


AU076611


Hs.154672




134037 134037 


AI808780 


Hs.227730 




103023 103023 


AW500470 






130282 130282 


BE245380 






128568 128568 




Hs.274691 




103093 103093 




Hs.44926


' C ? t "'j "fj t CDl ienosin 


129063 129063 




Hs.283822 




133227 133227 


AW977263 


Hs.68257 




103184 103184 




Hs.74049 




103208 103208 


AW411340 


Hs.31314 


retinoblastoma-binding protein 7 


131486 131486 








103334 103334 


NM_001260 


Hs.25283 


cyclin-dependent kinase 8


135094 135094 


NM_003304 


Hs.250687 




103352 103352 


H09366 


Hs.78853 




132173 132173 


X89426 






131584 131584 


AA598509 






103378 103378 


AL119690


Hs.153618




103410 103410 


AA158294


Hs.295362 




103438 103438 


AW175781


Hs.152720




103452 103452 


NM_006936 


Hs.85119 




135185 135185 


AW404908 


Hs.96038 


Ric (Drosophila)-like, expressed in many 


134662 134662 


NM_007048


Hs.284283 


butyrophilin, subfamily 3, member A1 


103500 103500 


AW408009 


Hs.22580


alkylglycerone phosphate synthase 


132084 132084 


NM_002267 




karyopherin alpha 3 (importin alpha 4) 


133152 133152 




Hs.324473 


mitogen-activated protein kinase 1 


103612 103612 


BE336654 


Hs.70937 




103692 103692 


AW137912


Hs.227583 




129796 129796 


BE218319 






132683 132683 


BE264633 


Hs.143638




103723 103723 


BE274312 


Hs.214783




133260 133260 


AA403045 






103766 103766 


AI920783


Hs.191435




132051 132051 




Hs.180145


HSPC030 protein 


135289 135289 


AW372569 






103794 103794 


AF244135 


Hs.30670




134319 134319 


BE304999 


Hs.285754 


fumarate hydratase 


119159 119159 


AF142419 


Hs.15020


homolog of mouse quaking QKI (KH domain


103850 103850 


AA187101 


Hs.213194 


hypothetical protein MGC10895


322026 322026 


AW024973 


Hs.283675 


NPD009 protein 


103861 103861 


AA206236 


Hs.4944 


hypothetical protein FLJ12783 


447735 447735 


AA775268 




Homo sapiens cDNA: FLJ23020 fis, clone L 


131236 131236 


AF043117 






129013 129013 
103988 103988 


AA371156


Hs.107942 


DKFZP564M112 protein 


AA314389 


Hs.342849 


ADP-ribosylation factor-like 5 


425284 425284 




Hs.348043


NS1-associated protein 1 


133281 133281 


AK001601 




high-mobility group 20A 


108154 108154 
135073 135073 




Hs.220689








Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


129593 129593 


AI338247 


Hs.98314


Homo sapiens mRNA; cDNA DKFZp586L0120 (f 


132064 132064 


AA121098 


Hs.3838 


serum-inducible kinase


131427 131427 


AF151879 


Hs.26706 


CGI-121 protein 


104282 104282 


C14448 


Hs.332338 


EST 


130443 130443 
132837 132837 


D25216 


Hs.155650 


KIAA0014 gene product 


AA370362 


Hs.57958 


EGF-TM7-latrophilin-related protein


104334 104334 


D82614 


Hs.78771 


phosphoglycerate kinase 1 


134731 134731 


D89377 


Hs.89404 


msh (Drosophila) homeo box homolog 2


131670 131670 


H03514 


Hs.15589 


ESTs 


104402 104402 


H56731 


Hs.132956 


ESTs 
129077 129077


N74724 


Hs.108479 


ESTs 


134927 134927 


L36531 


Hs.91296 


integrin, alpha 8 


134498 134498 


AW246273 


Hs.84131 


threonyl-tRNA synthetase 


104488 104488 


N56191 


Hs.106511 


protocadherin 17 


129214 129214 


AL044335 


Hs.109526 


zinc finger protein 198 


104530 104530 


AK001676 


Hs.12457 


hypothetical protein FLJ10814


104544 104544 


AI091173 


Hs.222352 


ESTs, Weakly similar to p40 [H.sapiens] 


104567 104567 


AA040620 


Hs.5672 


hypothetical protein AF140225 


129575 129575 


F08282 


Hs.278428 


progestin induced protein 


104599 104599 


AW815036 


Hs.151251 


ESTs 


104667 104667 


AI239923 


Hs.63931 


ESTs 


104764 104764 


AI039243 


Hs.278585 


ESTs 


104787 104787 


AA027317 




gb:ze97d11.s1 Soares fetal heart_NbHH19W


104804 104804 


AI858702 


Hs.31803 


ESTs, Weakly similar to N-WASP [H.sapien 


130828 130828 


AW631469 


Hs.203213 


ESTs 


104943 104943 


AF072873 


Hs.114218 


frizzled (Drosophila) homolog 6 


105024 105024 


AA126311


Hs.9879 


ESTs 


105038 105038 


AW503733 


Hs.9414 


KIAA1488 protein 


105096 105096 


AL042506 


Hs.21599 


Kruppel-like factor 7 (ubiquitous) 


105169 105169 


BE245294 


Hs.180789 


S164 protein 


130401 130401 


BE396283 


Hs.173987 


eukaryotic translation initiation factor 


130114 130114 


AA233393 


Hs.14992 


hypothetical protein FLJ11151 


105337 105337 


AI468789 


Hs.347187 


myotubularin related protein 1 


105376 105376 


AW994032 


Hs.8768 


hypothetical protein FLJ10849


131962 131962 


AK000046 


Hs.343877 


hypothetical protein FLJ20039 


128658 128658 


BE397354 


Hs.324830 


diptheria toxin resistance protein requi 


105508 105508 


AA173942 


Hs.326416 


Homo sapiens mRNA; cDNA DKFZp564H1916 (f 


135172 135172 


AB028956 


Hs.12144 


KIAA1033 protein 


132542 132542 


AL137751


Hs.263671 


Homo sapiens mRNA; cDNA DKFZp434l0812 (f 


105659 105659 


AA283044 


Hs.25625 


hypothetical protein FLJ11323 


105674 105674 


AI609530 


Hs.279789 


histone deacetylase 3 


105722 105722 


AI922821 


Hs.32433 


ESTs 


115951 115951 


BE546245 


Hs.301048 


sec13-like protein 


105985 105985 


AA406610 




gb:zv15b10.s1 Soares_NhHMPu_S1 Homosapi 


131216 131216 


AI815486 


Hs.243901 


Homo sapiens cDNA FLJ20738 fis, clone HE 


113689 113689 


AB037850 


Hs.16621 


DKFZP434I116 protein


130839 130839 


AB011169 


Hs.20141 


similar to S.cerevisiae SSM4 


130777 130777 


AW135049 


Hs.26285 


Homo sapiens cDNA FLJ10643 fis, clone NT 


106196 106196 


AA525993 


Hs.173699 


ESTs, Weakly similar to ALU1_HUMAN ALU S


133200 133200 


AB037715 


Hs.183639 


hypothetical protein FLJ10210


106328 106328 


AL079559 


Hs.28020 


KIAA0766 gene product 


106423 106423 


AB020722 


Hs.16714 


Rho guanine exchange factor (GEF) 15 


439608 439608 


AW864696 


Hs.301732 


hypothetical protein MGC5306 


106503 106503 


AB033042 


Hs.29679 


cofactor required for Sp1 transcriptiona


106543 106543 


AA676939 


Hs.69285 


neuropilin 1 


106589 106589 


AK000933 


Hs.28661 


Homo sapiens cDNA FLJ10071 fis, clone HE 


106596 106596 


AA452379 




ESTs, Moderately similar to ALU7_HUMAN A


106636 106636 


AW958037 


Hs.286 


ribosomal protein L4 


131353 131353 


AW754182 




gb:RC2-CT0321-131199-011^1 CT0321 Homo 


131710 131710 


NM_015368


Hs.30985 


pannexin 1 


131775 131775 


AB014548 


Hs.31921 


KIAA0648 protein 


106773 106773 


AA478109 


Hs.188833 


ESTs 


106817 106817 


D61216 


Hs.18672


ESTs 


106848 106848 


AA449014 


Hs.121025 


chromosome 11 open reading frame 5 


418699 418699 


BE539639 


Hs.173030 


ESTs, Weakly similar to ALU8_HUMAN ALU S 


130638 130638 


AW021276 


Hs.17121 


ESTs 


107059 107059 


BE614410 


Hs.23044 


RAD51 (S. cerevisiae) homolog (E coli Re 


107115 107115 


BE379623 


Hs.27693 


peptidylprolyl isomerase (cyclophilin)-l


107156 107156 


AA137043 


Hs.9663 


programmed cell death 6-interacting prot 


130621 130621 


AW513087 


Hs.16803 


LUC7 (S. cerevisiae)-like


132626 132626 


AW504732 


Hs.21275 


hypothetical protein FLJ11011 


131610 131610 


AA357879 


Hs.29423 


scavenger receptor with C-type lectin 


107295 107295 


AA186629 


Hs.80120 


UDP-N-acetyl-alpha-D-galactosamine:polyp 


107315 107315 


AA316241 


Hs.90691 


nucleophosmin/nucleoplasmin 3 


107328 107328 


AW959891 


Hs.76591 


KIAA0887 protein 


134715 134715 


U48263 


Hs.89040 


prepronociceptin 


129938 129938 


AW003668 


Hs.135587 


Human clone 23629 mRNA sequence 


130074 130074 


AL038596 


Hs.250745 


polymerase (RNA) III (DNA directed) (62k 


132036 132036 




Hs.37706 


hypothetical protein DKFZp434E2220 


113857 113857 


AW243158 


Hs.5297 


DKFZP564A2416 protein 


130419 130419 


AF037448 


Hs.155489 


NS1-associated protein 1 


132616 132616 


BE262677 


Hs.283558


hypothetical protein PR01855 


132358 132358 


NM_003542


Hs.46423 


H4 histone family, member G 


125827 125827 


NM_003403


Hs.97496 


YY1 transcription factor 


107609 107609 


R75654 


Hs.164797 


hypothetical protein FLJ13693


107714 107714 


AA015761 


Hs.60542 


ESTs 
107832


107832 


AA021473 




gb:ze66o11.s1 Soares retina N2b4HR Homo 


124337 


124337 


N23541 


Hs.281561 


Homo sapiens cDNA: FLJ23582 fis, clone L 


129577 


129577 


N75346 


Hs.306121 


CDC20 (cell division cycle 20, S. cerevi 


132000 


132000 


AW247017 


Hs.36978 


melanoma antigen, family A, 3 


107935 


107935 


AA029428 


Hs.61555 


ESTs 


131461 


131461 


AA992841 


Hs.27263 


KIAA1458 protein




108029 


AA040740 


Hs.62007 




108084 


108034 


AA058944 


Hs.116602


Homo sapiens, clone IMAGE:4154008, mRNA


108168 


108168 


AI453137 


Hs.63176 


ESTs 


108189 


108189 


AW376061 


Hs.63335 


ESTs, Moderately similar to A46010 X-lin 


108203 


108203 


AW847814 


Hs.289005 


Homo sapiens cDNA: FLJ21532 fis, clone C 


108217 


108217 


AA058686 


Hs.62588 


ESTs 


108277 


108277 


AA064859 




gb:zm50f03.s1 Stratagene fibroblast (937 


108309 


108309 


AA069818 


Hs.180909 


gb:zm67e03.r1 Stratagene neuroepithelium


108340 
108427 


108340 
108427 


AA069820 
AA076382 




peroxiredoxin 1 

gb:zm91g08.s1 Stratagene ovarian cancer


108439 


108439 


AA078986 




gb:zm92h01.s1 Stratagene ovarian cancer 


108469 


108469 


AA079487 




gb:zm97f08.s1 Stratagene colon HT29 (937 


108501 


108501 


AA083256 




gb:zn08g12.s1 Stratagene hNT neuron (937 


108562 


108562 


AA100796 




gb:zm26c06.s1 Stratagene pancreas (93720


130890 


130890 


AI907537 


Hs.76898 


stress-associated endoplasmic reticulum 


130385 


130385 


AW067800 


Hs.155223


stanniocalcin 2 


108807 


108807 


AI652236


Hs.49376 


hypothetical protein FLJ20644 


108833 


108833 


AF188527 


Hs.61661 


ESTs, Weakly similar to AF174605 1 F-box 


108846 


108846 


AL117452


Hs.44155 


DKFZP586G1517 protein 


131474 


131474 


L46353 


Hs.2726 


high-mobility group (nonhistone chromoso 


108941 


108941 


AA148650 






108996 


108996 


AW995610 


Hs.332436 


EST 


131183 


131183 


AI611807 


Hs.285107 


hypothetical protein FLJ13397


109022 


109022 


AA157291


Hs.21479 


ubinuclein 1


109068 


109068 


AA164293 


Hs.72545 


ESTs 


129021 


129021 


AL044675 


Hs.173081 


KIAA0530 protein 


109146 


109146 


AA176589 


Hs.142073 


EST 


131080 


131080 


NM_001955


Hs.2271 


endothelin 1 


109222 


109222 


AA192833


Hs.333512 


similar to rat myomegalin 


109481 


109481 


AA878923 


Hs.289069 


hypothetical protein FLJ21016 


109516 


109516 


AI471639 






109556 


109556 


AI925294 


Hs.87385 


ESTs 


109578 


109578 


F02208 


Hs.27214 


ESTs 


109625 


109625 


H29490 


Hs.22697 


ESTs 


109648 


109648 


H17800 


Hs.7154 


ESTs 


109699 


109699 


H18013 


Hs.167483 


ESTs 


109933 


109933 


R52417 


Hs.20945 


Homo sapiens clone 24993 mRNA sequence 


110039 


110039 


H11938 


Hs.21907 


histone acetyltransferase 
TABLE 2A

Table 2A shows the accession numbers for those pkeys lacking UnigeneIDs for Table 2. The pkeys in Table 7 lacking UnigeneIDs are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number 
Accession: Genbank accession numbers 
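
As an illustration only, a row of Table 2A can be read as a probeset identifier (Pkey), a gene cluster (CAT) number, and a whitespace-separated list of Genbank accession numbers. The sketch below assumes one row per line with fields separated by spaces; the names ClusterRow and parse_cluster_row are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterRow:
    """One Table 2A row (hypothetical structure, for illustration)."""
    pkey: int              # Unique Eos probeset identifier number
    cat_number: str        # Gene cluster number
    accessions: List[str]  # Genbank accession numbers in the cluster


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a row into Pkey, CAT number, and accession list.

    Assumes space-separated fields with at least one accession.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {line!r}")
    return ClusterRow(pkey=int(fields[0]),
                      cat_number=fields[1],
                      accessions=fields[2:])


# Example row taken from Table 2A.
row = parse_cluster_row("107832 genbank_AA021473 AA021473")
```

A cluster built from a single Genbank entry, as in this row, carries that entry's accession both in the CAT number and in the accession list.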



Pkey



108501 
108562 
101300 



CAT Number Accession 



116761_1
13684_-12
36375_1



AA079487 AA128547 AA128291 AA079587 M079600 



AA1 00795 AF020589 AA074629 AA075946 M100849 AA085347 AA 126309 A A079311 AA079323 AA085274 
BE535511 M62098 M306787 AW891766 AA348998 AA338869 AA344013 AW956561 AW389343 AW403607 L40391 
AW408435 AA121738 AI568978 H13317 R20373 AW948724 AW948744 M335023 AA436722 AA448690 C21404 
AW884390 AA345454 AA303292 AA174174 BE092290 T90614 AA035104 R76028 M126924 AA741086 AW022056 
AW118940 AA121666 AI832409 AA683475 AI140901 A1523576 AW519064 AW474125 AI953923 AI735349 AW150109 
AI436154 AW1 18130 AW270782 AI804073 N27434 AA876543 AA937815 AI051166 AA505378 AI041975 AI335355 
AI089540 AA662243 AI127912 AI925604 AI250880 AI366874 AI564386 AI815196 AI683526 AI435885 AI160934 H79030 
AI801493 AA448691 AI673767 AI076042 AI804327 AA813438 AA680002 AI274492 T16177 AI287337 AI935050 
AA907805 AA911493 AI589411 AI371358 AW576236 AID78866 AW516168 AA346372 AI560185 AA471009 R75857 
AA296025 AA5231 55 AA853168 AI696593 AI658482 AI566601 AW072797 AA128047 AA035502 AW243274 AA992517 
R43760 

W73853 AA9281 12 W77887 AW889237 AA148524 AI749182 AI754442 AI338392 AI253102 AI079403 AI370541 AI697341 
H97538 AW188021 AI927669 W72716 AI051402 AI188071 AI335900 N21488 AW770478 W92522 AI691028 AI913512 
AI144448 W73819 AA604358 N28900 W95221 AI868132 H98465 AA148793 



117155 145392J 



132983 11922_1 M30269 NM_002508 X82245 AI078760 AW957003 D78945 M27445 M650439 AL048816 AVS60256 AV660347

AA333052 BE295257 T60999 AA383049 AW3G9677 Z26985 AW175704 AA343326 AW747957 AI818389 W17308 
W17302 H15591 AA371284 AA370412 W94966 BE384365 T28498 R80714 R15959 H21723 AW835154 D56097 D56331 
W21232M190565 AW379755 AW0B7895 

133681 13893_1 AI352558 Z82248 X78138 NM_003405 AU077248 AA223125 S80794 D78577 AI124697 AW403970 BE614089 BE296713

BE621334 L20422 X80536 D54224 D54950 X57345 N29226 AA127798 AA340253 F08031 M192540 H6763G AA321827 
AW950283 M084159 BE538808 AW401377 AA256774 C03366 W45595 W47608 AA3O5Q09 H69431 H69456 AL120082 
H11706 AA303717 AA361357 H22042 H78020 AW999584 AA1 34368 AA322911 AA322961 H60980 N85248 N31547 
H79624 T11718 W85826 AW894663 AW894624 BE167441 BE170015 AA304626 AW602163 AW998929 AA156681 
AA151067 BEC02724 AA608688 H82692 BE155392 AW383636 BE155394 AA487004 AW383504 AI342365 R82553 
W16498BE155344AI143938 R69901 AA322873 AW340648 R25364 AA367935AI559406 M033522AA374252 
AW835019 AI922133 AI697089 N99662 AW189078 AI199076 AW151598 W59944 AA662875 W94022 AA299055 
AI039008 A1829449 AA583503 AI635674 AW131665 AI473820 AW2731 18 AW900930 AA90B944 AI688035 AW170272 
AI082545 AW468176 AI608761 AI082748 AI911682 AI248943 AI831016 M192465 AI218477 AA938405 AA385288 
AI809817 AA905196 AI191245 AI470204 A1188296 AI421367 A1125315 AI087141 AA629032 AA740589 AI554181 
AA150830 AI248541 AI077943 AA775958 AA864930 AI261476 AI123121 AI310394 AA862331 AA872478 BE537084 
AI205606 AA720684 AI872093 AW150042 AL120538 AA219627 AA988608 C21397 AI359337 H25337 AI089749 
AA605146 AI359620 AA1 50478 AI359738 AW383642 AW995424 AI766457 R55892 AI089839 W61343 N69107 W46459 
AA565955 N20527 AI279782 W46596 M776573 H23204 AI856231 AI083995 NI21530 AA12S874 D82630 W65437 
AI086917 AW382095 AI086877 H69844 AW340217 W85827 L08439 AA262704 AA505380 W47413 W94135 AA223241 
AW089153 AA084101 BE538000 AA095126 T28031 AA491574 R84813 AA774536 AW383522 AA155615 AW383529 
AA491520 AW028427 AA171496 AI469689 AW664539 AI811102 AI81 1116 BE464590 BE350791 H78021 T15405 H21979 
AA219489 H13301 AA505883 AI864305 AI423963 AW084401 F04963 R69B58 H67097 AI917740 AI655561 H69864 
AA033631 AW383484 AI8862S1 H25293 AA513281 AW271187 H1 1617 N79982 AI174338 AI904207 A1904208 BE614558 
W94127 W65436 AI272249 AA700018 AI579932 AI085941 AW152629 

121335 279548_1 AA404418 AI217248

130018 18986_1 AA353093 AW957317 AW872498 AI560785 AI289110 AW135512 X97261 T68873

121822 244391J AI743860 N49543 AW027759 BE349467 AI656284 BE463975 R35022 AA370031 AW955302 AL042109 N53092 AI61 1424 
AL079362 AI969290 AI928016 BE394912 BE504220 BE467505 AI61 1611 AI611407 AI611452 W56437 AI284566 
AI583349 AW183058 AI308085 AI074952 AA437315 AA628161 AW301728 AI150224 AA400137 AA437279 AI223355 
AA639462 AI261373 AI432414 AI984994 AI539335 AA401550 AA358757 AI609976 AA442357 AA359393 AA437046 
AA370301 AA429328 AW272055 AI580502 AI832944 AI038530 AA425107 AI014986 AI148349 AW237721 AW779756 
AW1 37877 A1125293 AA400404 R28554 

108309 111495_1 AA069818 AA069971 AA069923 AA069908

107832 genbank_AA021473 AA021473 

123523 genbank_AA608588 AA608588

123964 genbank_C13961 C13961 

118475 genbank_N66845 N66845

104787 genbank_AA027317 AA027317 

106596 304084_1 AI583948 AA578212 AW303715 AA653450 AA456981 AI400385 W88533 AI224133 AW272145 AA088686 R94698

113947 genbank_W84768 W84768

108277 genbank_AA064859 AA064859
108427 genbank_AA076382 AA076382

108439 genbank_AA078986 AA078986 

131353 231290_1 AW411259 H23555 AW015049 AI684275 AW015886 AW068953 AW014085 AI027260 R52686 AA918278 AI129462

AA969360 N34869 AI948413 AA534205 AA702483 AA705292

101447 entrez_M21305 M21305 

108931 genbank_AA147186 AA147186 

108941 genbank_AA148650 AA148650

103138 entrez_X65965 X65965 

119174 genbank_R71234 R71234 

119416 genbank_T97186 T97186

105985 genbank_AA406610 AA406610 

100327 entrez_D55640 D55640 
TABLE 3:

Pkey: Unique Eos probeset identifier number 

Accession: Accession number used for previous patent filings 
ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number
Unigene Title: Unigene gene title 
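
The five columns defined above can be treated as one record per probeset. The following is a minimal sketch, assuming each row has been read into those five fields; ProbesetRecord and index_by_unigene are hypothetical names introduced here for illustration, and the three records are rows that appear in Table 3.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class ProbesetRecord(NamedTuple):
    """One Table 3 row (hypothetical structure, for illustration)."""
    pkey: int            # Unique Eos probeset identifier number
    accession: str       # Accession number used for previous filings
    ex_accn: str         # Exemplar Genbank accession number
    unigene_id: str      # Unigene number
    unigene_title: str   # Unigene gene title


records = [
    ProbesetRecord(101838, "M92934", "BE243845", "Hs.75511",
                   "connective tissue growth factor"),
    ProbesetRecord(106423, "AA448238", "AB020722", "Hs.16714",
                   "Rho guanine exchange factor (GEF) 15"),
    ProbesetRecord(107216, "D51069", "D51069", "Hs.211579",
                   "melanoma cell adhesion molecule"),
]


def index_by_unigene(rows: Iterable[ProbesetRecord]) -> Dict[str, List[int]]:
    """Group probeset identifiers by Unigene number."""
    index: Dict[str, List[int]] = defaultdict(list)
    for rec in rows:
        index[rec.unigene_id].append(rec.pkey)
    return dict(index)


by_cluster = index_by_unigene(records)
```

Grouping by Unigene number makes it easy to see when several probesets report on the same gene cluster.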



Pkey Accession ExAccn UniGene UnigeneTitle 



100405 D86425 
100420 D86983 
100481 HG1098-HT1098 
100484 HG1103-HT1103 
100718 HG3342-HT3519 
100991 J03764 
101097 L06797 
101168 L15388 
101194 L20971 
101261 L35545 
101345 L76380 
101447 M21305 
101485 M24736 
101543 M31166 
101550 M31551 
101560 M32334 
101674 M61916 
101714 M68874 
101741 M74719 
101838 M92934 
101857 M94856 
102012 U03057 
102024 U03877 
102164 U18300 
102241 U27109 
102283 U31384 



102663 U70322 

102759 U81607 

102778 U83463 

102804 U89942 



102915 X07820 

103036 X54925 

103037 X54936 
103095 X60957 
103158 X67235 
103166 X67951 
103185 X69910 
103280 X79981 
103554 Z18951 
103850 AA187101 
104465 N24990 
104592 R81003 
104764 AA025351 
104786 AA027168 
104850 AA040465 
104865 AA045136 
104894 AA054087 
104952 AA071089 
104974 AA085918 
105178 AA187490 
105263 AA227926 
105330 AA234743 
105376 AA236559 
105729 AA292694 
105826 AA398243 
105977 AA406353 
106008 AA411465 
106031 AA412284 
106124 AA423987 



AW291587 Hs.82733
D86983 Hs.118893
X70377 Hs.121489
NM_005402 Hs.288757
BE295928 Hs.75424
J03836 Hs.82085
BE245301 Hs.89414
NM_005308 Hs.211569
L20971 Hs.188
D30857 Hs.82353
NM_005795 Hs.152175
M21305

AA296520 Hs.89546
M31166 Hs.2050
Y00630 Hs.75716
AW958272 Hs.347326
NM_002291 Hs.82124
M68874 Hs.211587
NM_003199 Hs.326198
BE243845 Hs.75511
BE550723 Hs.153179
BE259035 Hs.118400
AA301867 Hs.76224
NM_000107 Hs.77602
NM_007351 Hs.268107
AW161552 Hs.83381
U33053 Hs.2499
U59423 Hs.79067
NM_002270 Hs.168075
NM_005100 Hs.788
AF000652 Hs.8180
NM_002318 Hs.83354
J03836 Hs.82085
NM_002205 Hs.149609
X07820 Hs.2258
M13509 Hs.83169
BE018302 Hs.2894
NM_005424 Hs.78824
BE242587 Hs.118651
AA159248 Hs.180909
NM_006825 Hs.74368
U84722 Hs.76206
AI878826 Hs.74034
AA187101 Hs.213194
Z44203 Hs.26418
AW630488 Hs.25338



AA027167 Hs.10031 

AL133035 Hs.8728 

T79340 Hs.22575 

AF065214 Hs.18858

AW076098 Hs.345588 

Y12059 Hs.278675 

AA313825 Hs.21941 



AW338625 Hs.22120 
AW994032 Hs.8768 
H46612 Hs.293815 
AA478756 Hs.194477 
AK001972 Hs.30822 
AB033888 Hs.8619 
X64116 Hs.171844 
H93366 Hs.7567 



v-ral simian leukemia viral oncogene hom
inhibitor of DNA binding 1, dominant neg 
serine (or cysteine) proteinase inhibito 
chemokine (C-X-C motif), receptor 4 (fus
G protein-coupled receptor kinase 5 
phosphodiesterase 4B, cAMP-specific (dun 
protein C receptor, endothelial (EPCR) 
calcitonin receptor-like 
gb:Human alpha satellite and satellite 3
selectin E (endothelial adhesion molecul 
pentaxin-related gene, rapidly induced b 
serine (or cysteine) proteinase inhibito 
intercellular adhesion molecule 2 
laminin, beta 1 

phospholipase A2, group IVA (cytosolic,
transcription factor 4 
connective tissue growth factor 
fatty acid binding protein 5 (psoriasis- 
singed (Drosophila)-like (sea urchin fas
EGF-containing fibulin-like extracellula 
damage-specific DNA binding protein 2 (4 



guanine nucleotide binding protein 11 

protein kinase C-like 1

MAD (mothers against decapentaplegic, Dr 

karyopherin (importin) beta 2 

A kinase (PRKA) anchor protein (gravin) 

syndecan binding protein (syntenin) 

lysyl oxidase-like 2 

serine (or cysteine) proteinase inhibito 

integrin, alpha 5 (fibronectin receptor

matrix metalloproteinase 10 (stromelysin 

matrix metalloproteinase 1 (interstitial 

placental growth factor, vascular endoth 

tyrosine kinase with immunoglobulin and 

hematopoietically expressed homeobox 

peroxiredoxin 1 

transmembrane protein (63kD), endoplasmi 
cadherin 5, type 2, VE-cadherin (vascula 
caveolin 1, caveolae protein, 22kD 
hypothetical protein MGC10895 
ESTs 

protease, serine, 23 
ESTs 

KIAA0955 protein 

hypothetical protein DKFZp434G171 
B-cell CLL/lymphoma 6, member B (zinc fi 
phospholipase A2, group IVC (cytosolic, 
desmoplakin (DPI, DPII)
bromodomain-containing 4
AD036 protein 

solute carrier family 7, (cationic amino 
ESTs 

hypothetical protein FLJ10849 

Homo sapiens HSPC285 mRNA, partial cds 

E3 ubiquitin ligase SMURF2 

hypothetical protein FLJ11110 

SRY (sex determining region Y)-box 18 

Homo sapiens cDNA: FLJ22296 fis, clone H 

Homo sapiens cDNA: FLJ21962 fis, clone H
106155 AA425309


AA425414 


Hs.33287 


nuclear factor I/B


106302 AA435896 


AA398859 


Hs.18397 


hypothetical protein FLJ23221 


106423 AA448238 


AB020722 


Hs.16714 


Rho guanine exchange factor (GEF) 15 


106793 AA478778 


H94997 


Hs.16450 


ESTs 


107174 AA621714 


BE122762 


Hs.25338 


ESTs 


107216 D51069 


D51069 


Hs.211579 


melanoma cell adhesion molecule 


107295 T34527 


AA186629 


Hs.80120 


UDP-N-acetyl-alpha-D-galactosamine:polyp 


107385 U97519 


NM_005397 Hs.16426 


podocalyxin-like 


108756 AA127221 


AA127221 


Hs.117037 


ESTs 


108846 AA132983 


AL117452 


Hs.44155 


DKFZP586G1517 protein 


108888 AA135606 


AA135606 


Hs.189384 


gb:zl10a05.s1 Soares_pregnant_uterus_NbH 


109001 AA156125 


AI056548 


Hs.72116 


hypothetical protein FLJ20992 similar to 


109166 AA179845 


AA219691 


Hs.73625 


RAB6 interacting, kinesin-like (rabkines 


109456 AA232645 


AW956580 


Hs.42699 


ESTs 


109768 F10399 


F06838 


Hs.14763 


ESTs 


110107 H16772 


AW151660 Hs.31444 


ESTs 


110906 N39584 


AA035211 


Hs.17404 


ESTs 


110984 N52006 


AW613287 Hs.80120 


UDP-N-acetyl-alpha-D-galactosamine:polyp 


111006 N53375 


BE387014 


Hs.166146 


Homer, neuronal immediate early gene, 3 


111018 N54067 


AI287912 


Hs.3628 


mitogen-activated protein kinase kinase 


111133 N64436 


AW580939 


Hs.97199 


complement component C1q receptor 


111760 R26892 


BE551929 


Hs.268754 


Homo sapiens cDNA FLJ11949 fis, clone HE 


113073 T33637 


N39342 


Hs.103042 


microtubule-associated protein 1B 


113195 T57112 


H83265 


Hs.8881 


ESTs, Weakly similar to S41044 chromosom 


113923 W80763 


AW953484 


Hs.3849 


hypothetical protein FLJ22041 similar to 


114521 AA046808 


AW139036 


Hs.108957 


40S ribosomal protein S27 isoform 


115061 AA253217 


AI751438 


Hs.41271 


Homo sapiens mRNA full length insert cDN 


115096 AA255991 


AI683069 


Hs.175319 


ESTs 


115145 AA258138 


AA740907 


Hs.88297 


ESTs 


115819 AA426573 


AA486620 


Hs.41135 


endomucin-2 


115947 AA443793 


R47479 


Hs.94761 


KIAA1691 protein 


116314 AA490588 


AI799104 


Hs.178705 


Homo sapiens cDNA FLJ11333 fis, clone PL 


116339 AA496257 


AK000290 


Hs.44033 


dipeptidyl peptidases 


116430 AA609717 


AK001531 


Hs.66048 


hypothetical protein FLJ10669 


116589 D59570 


AI557212 


Hs.17132 


ESTs, Moderately similar to I54374 gene 


116733 F13787 


AL157424 


Hs.61289 


synaptojanin 2 


117023 H88157 


AW070211 


Hs.102415 


Homo sapiens mRNA; cDNA DKFZp586N0121 (f 


117186 H98988 


H98988 


Hs.42612 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


117563 N34287 


AF055634 


Hs.44553 


unc5 (C.elegans homolog) c 


117997 N52090 


N52090 


Hs.47420 


EST 


118475 N66845 


N66845 




gb:za46c11.s1 Soares fetal liver spleen 


118581 N68905 


N68905 




gb:za69b09.s1 Soares_fetal_lung_NbHL19W 


119073 R32894 


BE245360 


Hs.279477 


ESTs 


119155 R61715 


R61715 


Hs.310598 


ESTs, Moderately similar to ALU1_HUMAN A 


119174 R71234 


R71234 




gb:yi54c08.s1 Soares placenta Nb2HP Homo 


119221 R98105 


C14322 


Hs.250700 


tryptase beta 1 


119416 T97186 


T97186 




gb:ye50h09.s1 Soares fetal liver spleen 


119866 W80814 


AA496205 


Hs.193700 


Homo sapiens mRNA; cDNA DKFZp586I0324 (f 


121335 AA404418 


AA404418 




gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_ 


121381 AA405747 


AW088642 Hs.97984 


hypothetical protein FLJ22252 similar to 


123160 AA488687 


AA488687 


Hs.284235 


ESTs, Weakly similar to I38022 hypotheti 


123473 AA599143 


AA599143 




gb:ae52d04.s1 Stratagene lung carcinoma 


123523 AA608588 


AA608588 




gb:ae54e06.s1 Stratagene lung carcinoma 


123533 AA608751 


AA608751 




gb:ae56h07.s1 Stratagene lung carcinoma 


123964 C13961 


C13961 




gb:C13961 Clontech human aorta polyA+mR 


124006 D60302 


AI147155 


Hs.270016 


ESTs 


124315 H94892 


NM_005402 


Hs.288757 


v-ral simian leukemia viral oncogene hom 


124659 N93521 


AI680737 


Hs.289068 


Homo sapiens cDNA FLJ11918 fis, clone HE 


124669 N95477 


AI571594 


Hs.102943 


hypothetical protein MGC12916 


124847 R60044 


W07701 


Hs.304177 


Homo sapiens clone FLBB503 PRO2286 mRNA, 


124375 R70506 


AI887664 


Hs.285814 


sprouty (Drosophila) homolog 4 


125091 T91518 


T91518 




gb:ye20f05.s1 Stratagene lung (937210) H 


125103 T95333 


AA570056 


Hs.122730 


ESTs, Moderately similar to KIAA1215 pro 


125355 R45630 


R60547 


Hs.170098 


KIAA0372 gene product 


125565 R20839 


R20840 




gb:yg05c08.r1 Soares infant brain 1NIB H 


125590 R23858 


R23858 


Hs.143375 


Homo sapiens, clone IMAGE:3840937, mRNA, 


126511 AI024874 


T92143 


Hs.57958 


EGF-TM7-latrophilin-related protein 


126563 W26247 


AA516391 


Hs.181368 


U5 snRNP-specific protein (220 kD), orth 


126649 AA856990 


AA001860 


Hs.279531 




126872 AA136653 


AW450979 


gb:UI-H-BI3-ala-a-12-0-UI.s1 NCI_CGAP_Su 


127402 AA358869 


AA358869 


Hs.227949 


SEC13 (S. cerevisiae)-like 1 


127651 AI123976 


AA382523 


Hs.105689 


MSTP031 protein 


127759 AI369384 


AI369384 


Hs.292441 


ESTs 


128062 AA379500 


AA379621 


Hs.105547 


neural proliferation, differentiation an 


128992 R49693 


H04150 


Hs.107708 


ESTs 


129046 AA195678 


AB029290 


Hs.108258 


actin binding protein; macrophin (microf 






WO 02/079492 



PCT/US02/04915 



129188 M30257 




vascular cell adhesion molecule 1 






mesoderm development candidate 1 






von Willebrand factor 




AW410538 Hs.111779 


secreted protein, acidic, cysteine-rich 


129765 M86933 


M86933 Hs.1238 


amelogenin (Y chromosome) 




AA012848 Hs.12570 


tubulin-specific chaperone d 




AF055581 Hs.13131 


lysosomal 


130495 AA243278 


AW250380 Hs.109059 


mitochondrial ribosomal protein L12 


130639 D59711 


AI557212 Hs.17132 




130657 T94452 


AW337575 Hs.201591 




130828 AA053400 


AW631469 Hs.203213 




130972 AA370302 


D81866 Hs.21739 


Homo sapiens mRNA; cDNA DKFZp586I1518 (f 
endothelin 1 


131080 J05008 


NM_001955 Hs.2271 


131137 U85193 


W27392 Hs.33287 


nuclear factor I/B 


131182 AA256153 


AI824144 Hs.23912 




131486 X83107 


F06972 Hs.27372 


BMX non-receptor tyrosine kinase 


131573 AA046593 






131647 AA410480 






131756 D45304 






131859 M90657 


AW960564 


t b 4 rf 'I 


131881 AA010163 




upstream regulatory element binding prot 














132164 U84573 


AI752235 Hs.41270 


I ' 0 






H4 histone family, member G 


132413 AA132969 


AW361383 Hs.260116 


metalloprotease 1 (pitrilysin family) 


132456 AA114250 


AB011084 Hs.48924 


KIAA0512 gene product; ALEX2 


132490 F13782 


NM_001290 Hs.4980 


LIM domain binding 2 


132676 AA283035 


N92589 Hs.261038 


ESTs, Weakly similar to I38022 hypotheti 


132687 AB002301 


AB002301 Hs.54985 


KIAA0303 protein 


132718 AA056731 


NM_004600 Hs.554 


Sjogren syndrome antigen A2 (60kD, ribon 


132736 U68019 


AW081883 Hs.211578 


Homo sapiens cDNA: FLJ23037 fis, clone L 




AA125985 Hs.56145 


thymosin, beta, identified in neuroblast 


132933 AA598702 


BE263252 Hs.6101 


hypothetical protein MGC3178 


132968 N77151 


AF234532 Hs.61638 




132994 AA505133 


AA112748 Hs.279905 


clone HQ0310 PRO0310p1 


133061 AB000584 


AI186431 Hs.296638 




133147 D12763 


AA026533 Hs.66 


interleukin 1 receptor-like 1 


133161 AA253193 


AW021103 Hs.6631 


hypothetical protein FLJ20373 


133200 AA432248 


AB037715 Hs.183639 


hypothetical protein FLJ10210 




AA403045 Hs.6906 


Homo sapiens cDNA: FLJ23197 fis, clone R 


133363 AA479713 


AI866286 Hs.71962 


ESTs, Weakly similar to B36298 proline-r 


Icj^yi L4Uoao 


BE619053 Hs.170001 


eukaryotic translation initiation factor 


133517 X52947 


NM_000165 Hs.74471 


gap junction protein, alpha 1, 43kD (con 






vesicle-associated membrane protein 5 (m 


133607 M34539 


BE273749 


FK506-binding protein 1A (12kD) 


133614 D67029 


NM_003003 Hs.75232 


SEC14 (S. cerevisiae)-like 1 


133627 U09587 


NM_002047 Hs.75280 


glycyl-tRNA synthetase 


133691 M85289 


M85289 Hs.211573 




133696 D10522 


AI878921 Hs.75607 


myristoylated alanine-rich protein kinas 


133913 W84712 


AU076964 Hs.7753 


calumenin 


133975 D29992 


C18356 Hs.295944 


tissue factor pathway inhibitor 2 


133985 L34657 


L34657 Hs.78146 


platelet/endothelial cell adhesion molec 


134039 S78569 


NM_002290 Hs.78672 


laminin, alpha 4 


134088 D43636 


AI379954 Hs.79025 


KIAA0096 protein 


134161 U97188 


AA634543 Hs.79440 


IGF-II mRNA-binding protein 3 


134299 AA487558 


AW580939 Hs.97199 


complement component C1q receptor 


134416 M28882 


X68264 Hs.211579 


melanoma cell adhesion molecule 


134453 X70683 


AI272141 Hs.83484 


SRY (sex determining region Y)-box 4 
thrombospondin 1 


104OQ0 Al4/0f 


Al/OUofO nS.o/4Ua 


134989 AA236324 


AW968058 Hs.92381 


nudix (nucleoside diphosphate linked moi 


135051 C15324 


AI272141 Hs.83484 


SRY (sex determining region Y)-box 4 


135073 AA452000 


W55956 Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


135349 D83174 


AA114212 Hs.9930 


serine (or cysteine) proteinase inhibito 


100114 D00596 


X02308 Hs.82962 




100130 D11428 


NM_000304 Hs.103724 


peripheral myelin protein 22 


100143 D13640 


AU076465 Hs.278441 


KIAA0015 gene product 


100163 D14874 


H73444 Hs.394 


adrenomedullin 


100208 D26129 


NM_002933 Hs.78224 


ribonuclease, RNase A family, 1 (pancrea 


100224 D28476 


AL121516 Hs.138617 


thyroid hormone receptor interactor 12 


100405 D86425 


AW291587 Hs.82733 


nidogen 2 


100420 D86983 


D86983 Hs.118893 


Melanoma associated gene 


100455 D87953 


AW888941 Hs.75789 


N-myc downstream regulated 


100529 HG1862-HT1897 


BE313693 Hs.334330 


calmodulin 2 (phosphorylase kinase, delt 


100618 HG2614-HT2710 AI752163 Hs.114599 


collagen, type VIII, alpha 1 


100619 HG2639-HT2735 N24433 Hs.241567 


RNA binding motif, single stranded inter 
100658 


HG2855-HT2995 U56725 Hs.180414 


heat shock 70kD protein 2 


100676 


HG3044-HT3742 X02761 Hs.287820 


fibronectin 1 


100718 


HG3342-HT351E 


I BE295928 Hs.75424 


inhibitor of DNA binding 1, dominant neg 


100752 


HG3543-HT3739 T81309 


insulin-like growth factor 2 (somatomedi 


100828 


HG4069-HT4339 AL048753 Hs.303649 


small inducible cytokine A2 (monocyte ch 


100850 


HG417-HT417 


AA836472 Hs.297939 


cathepsin B 


100991 


J03764 


J03836 Hs.82085 


serine (or cysteine) proteinase inhibito 


101097 


L06797 


BE245301 Hs.89414 


chemokine (C-X-C motif), receptor 4 (fus 


101110 


L08246 


AI439011 Hs.86386 


myeloid cell leukemia sequence 1 (BCL2-r 


101142 


L12711 


L12711 Hs.89643 


transketolase (Wernicke-Korsakoff syndro 


101156 


L13977 


AA340987 Hs.75693 


procarboxypeptidase (angiotensinase C 


101168 


L15388 


NM_005308 Hs.211569 


G protein-coupled receptor kinase 5 


101184 


L19871 


NM_001674 Hs.460 


activating transcription factor 3 


101192 


L20859 


BE247295 Hs.78452 


solute carrier family 20 (phosphate tran 


101317 


L42176 


L42176 Hs.8302 


four and a half LIM domains 2 


101336 


L49169 


NM_006732 Hs.75678 


FBJ murine osteosarcoma viral oncogene h 


101345 


L76380 


NM_005795 Hs.152175 


calcitonin receptor-like 


101400 


M15990 


M15990 Hs.194148 


v-yes-1 Yamaguchi sarcoma viral oncogene 


101475 


M23254 


BE410405 Hs.76288 


calpain 2, (m/II) large subunit 


101485 


M24736 


AA296520 Hs.89546 


selectin E (endothelial adhesion molecul 


101496 


M26576 


X12784 Hs.119129 


collagen, type IV, alpha 1 


101505 


M27396 


AA307680 Hs.75692 


asparagine synthetase 


101543 


M31166 


M31166 Hs.2050 


pentaxin-related gene, rapidly induced b 


101557 


M31994 


BE293116 Hs.76392 


aldehyde dehydrogenase 1 family, member 


101560 


M32334 


AW958272 Hs.347326 


intercellular adhesion molecule 2 


101587 


M35878 


AI752416 Hs.77326 


insulin-like growth factor binding prote 


101592 


M36429 


AF064853 Hs.91299 


guanine nucleotide binding protein (G pr 


101633 


M57730 


NM_004428 Hs.1624 


ephrin-A1 


101634 


M57731 


AV650262 Hs.75765 


GRO2 oncogene 


101667 


M60858 


NM_005381 


nucleolin 


101682 


M62994 


AF043045 Hs.81008 


filamin B, beta (actin-binding protein-2 


101714 


M68874 


M68874 Hs.211587 


phospholipase A2, group IVA (cytosolic, 


101720 


M69043 


M69043 Hs.81328 


nuclear factor of kappa light polypeptid 


101741 


M74719 


NM_003199 Hs.326198 


transcription factor 4 


101744 


M75126 


AI879352 Hs.118625 


hexokinase 1 


101793 


M84349 


W01076 Hs.278573 


CD59 antigen p18-20 (antigen identified 


101837 


M92843 


M92843 Hs.343586 


zinc finger protein homologous to Zfp-36 


101838 


M92934 


BE243845 Hs.75511 


connective tissue growth factor 


101840 


M93056 


AA236291 Hs.183583 


serine (or cysteine) proteinase inhibito 


101857 


M94856 


BE550723 Hs.153179 


fatty acid binding protein 5 (psoriasis- 


101864 


M95787 


BE392588 Hs.75777 


transgelin 


101931 


S76965 


NM_006823 Hs.75209 


protein kinase (cAMP-dependent, catalyti 


101966 


S81914 


X96438 Hs.76095 


immediate early response 3 


102012 


U03057 


BE259035 Hs.118400 


singed (Drosophila)-like (sea urchin fas 


102013 


U03100 


BE616287 Hs.178452 


catenin (cadherin-associated protein), a 


102024 


U03877 


AA301867 Hs.76224 


EGF-containing fibulin-like extracellula 


102059 


U08021 


AI752666 Hs.76669 


nicotinamide N-methyltransferase 


102121 


U14391 


NM_004998 Hs.82251 


myosin IE 


102283 


U31384 


AW161552 Hs.83381 


guanine nucleotide binding protein 11 


102300 


U32944 


AI929721 Hs.5120 


dynein, cytoplasmic, light polypeptide 


102378 


U40369 


AU076887 Hs.28491 


spermidine/spermine N1-acetyltransferase 


102395 


U41767 


AU077005 Hs.92208 


a disintegrin and metalloproteinase doma 


102460 


U48959 


U48959 Hs.211582 


myosin, light polypeptide kinase 


102491 


U51010 


U51010 


gb:Human nicotinamide N-methyltransferas 


102499 


U51478 


BE243877 Hs.76941 


ATPase, Na+/K+ transporting, beta 3 poly 


102523 


U53445 


U53445 Hs.15432 


downregulated in ovarian cancer 1 


102560 


U59289 


R97457 Hs.63984 


cadherin 13, H-cadherin (heart) 


102564 


U59423 


U59423 Hs.79067 


MAD (mothers against decapentaplegic, Dr 


102589 


U62015 


AU076728 Hs.8867 


cysteine-rich, angiogenic inducer, 61 


102600 


U63825 


AI984144 Hs.66713 


hepatitis delta antigen-interacting prot 


102645 


U67963 


AL119566 Hs.6721 


lysosomal 


102687 


U73379 


NM_007019 Hs.93002 


ubiquitin carrier protein E2-C 


102693 


U73824 


AA532780 Hs.183684 


eukaryotic translation initiation factor 


102709 


U77604 


AA122237 Hs.81874 


microsomal glutathione S-transferase 2 


102759 


U81607 


NM_005100 Hs.788 


A kinase (PRKA) anchor protein (gravin) 


102804 


U89942 


NM_002318 Hs.83354 


lysyl oxidase-like 2 


102882 


X04412 


AI767736 Hs.290070 


gelsolin (amyloidosis, Finnish type) 


102907 


X06985 


BE409861 Hs.202833 


heme oxygenase (decycling) 1 




X07820 


X07820 Hs.2258 


matrix metalloproteinase 10 (stromelysin 


102927 


X12876 


BE512730 Hs.65114 


keratin 18 


102960 


X15729 


AI904738 Hs.76053 


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 


103011 


X52541 


AJ243425 Hs.326035 


early growth response 1 


103020 


X53416 


X53416 Hs.195464 


filamin A, alpha (actin-binding protein- 


103029 


X54489 


AW800726 Hs.789 


GRO1 oncogene (melanoma growth stimulati 


103036 


X54925 


M13509 Hs.83169 


matrix metalloproteinase 1 (interstitial 


103056 


X57206 


Y18024 Hs.78877 


inositol 1,4,5-trisphosphate 3-kinase B 
103080 


X59798 


AU077231 Hs.82932 


cyclin D1 (PRAD1: parathyroid adenomatos 


103095 


X60957 


NM_005424 Hs.78824 


tyrosine kinase with immunoglobulin and 


103138 


X65965 


X65965 


gb:H.sapiens SOD-2 gene for manganese su 


103176 


X69111 


AL021154 Hs.76884 


inhibitor of DNA binding 3, dominant neg 


103195 


X70940 


AA351647 Hs.2642 


eukaryotic translation elongation factor 


103347 


X87838 


AU077309 Hs.171271 


catenin (cadherin-associated protein), b 


103371 


X91247 


X91247 Hs.13046 


thioredoxin reductase 1 


103432 


X97748 


X97748 


gb:H.sapiens PTX3 gene promotor region. 


103471 


Y00815 


Y00815 Hs.75216 


protein tyrosine phosphatase, receptor t 


103967 


AA303711 


AL120051 Hs.144700 


ephrin-B1 
ESTs 


104447 


L44538 


AW204145 Hs.156044 


104764 


AA025351 


AI039243 Hs.278585 


ESTs 


104783 


AA027050 


AA533513 Hs.93659 


protein disulfide isomerase related prot 


104798 


AA029462 


AW952619 Hs.17235 


Homo sapiens clone TCCCIA00176 mRNA sequ 


104865 


AA045136 


T79340 Hs.22575 


B-cell CLL/lymphoma 6, member B (zinc fi 


104877 


AA047437 


AI138635 Hs.22968 


Homo sapiens clone IMAGE:451939, mRNA se 


104894 


AA054087 


AF065214 Hs.18858 


phospholipase A2, group IVC (cytosolic, 


104952 


AA071089 


AW076098 Hs.345588 


desmoplakin (DPI, DPII) 


105113 


AA156450 


AB037816 Hs.8982 


Homo sapiens, clone IMAGE:3506202, mRNA, 


105178 


AA187490 


AA313825 Hs.21941 


AD036 protein 


105196 


AA195031 


W84893 Hs.9305 


angiotensin receptor-like 1 


105215 


AA205724 


AA205759 Hs.10119 


hypothetical protein FLJ14957 


105263 


AA227926 


AW388633 Hs.6682 


solute carrier family 7, (cationic amino 


105271 


AA227986 


AA807881 Hs.25329 


ESTs 


105330 


AA234743 


AW338625 Hs.22120 


ESTs 


105461 


AA253216 


BE539071 Hs.69388 


hypothetical protein FLJ20505 


105492 


AA256210 


AI805717 Hs.289112 


CGI-43 protein 


105493 


AA256268 


AL047586 Hs.10283 


RNA binding motif protein 8B 


105594 


AA279397 


AB024334 Hs.25001 


tyrosine 3-monooxygenase/tryptophan 5-mo 


105727 


AA292379 


AL135159 Hs.20340 


KIAA1002 protein 


105732 


AA292717 


AW504170 Hs.274344 


hypothetical protein MGC12942 


105767 


AA346551 


AW370946 Hs.23457 


ESTs 


105882 


AA400292 


W46802 Hs.81988 


disabled (Drosophila) homolog 2 (mitogen 


105936 


AA404338 


AI678765 Hs.21812 


ESTs 


106031 


AA412284 


X64116 Hs.171844 


Homo sapiens cDNA: FLJ22296 fis, clone H 


106124 


AA423987 


H93366 Hs.7567 


Homo sapiens cDNA: FLJ21962 fis, clone H 


106222 


AA428594 


AA356392 Hs.21321 


Homo sapiens clone FLB9213 PRO2474 mRNA, 


106241 


AA430108 


BE019681 Hs.6019 


Homo sapiens cDNA: FLJ21288 fis, clone C 


106263 


AA431462 


W21493 Hs.28329 


hypothetical protein FLJ14005 


106264 


AA431470 


AL046859 Hs.3407 


protein kinase (cAMP-dependent, catalyti 


106366 


AA443756 


AA186715 Hs.336429 


RIKEN cDNA 9130422N19 gene 


106454 


AA449479 


NM_014038 Hs.5216 


HSPC028 protein 


106634 


AA459916 


W25491 Hs.288909 


hypothetical protein FLJ22471 


106724 


AA465226 


N48670 Hs.28631 


Homo sapiens cDNA: FLJ22141 fis, clone H 


106793 


AA478778 


H94997 Hs.16450 


ESTs 


106799 


AA479037 


BE313412 Hs.7961 


Homo sapiens clone 25012 mRNA sequence 


106842 


AA482597 


AF124251 Hs.26054 


novel SH2-containing protein 3 


106868 


AA487561 


BE185536 Hs.301183 


molecule possessing ankyrin repeats indu 


106890 


AA489245 


AA489245 Hs.88500 


mitogen-activated protein kinase 8 inter 


106961 


AA504110 


AW243614 Hs.18063 


Homo sapiens cDNA FLJ10768 fis, clone NT 


106974 


AA520989 


AI817130 Hs.9195 


Homo sapiens cDNA FLJ13698 fis, clone PL 


107030 


AA599434 


AL117424 Hs.25035 


chloride intracellular channel 4 


107061 


AA608649 


BE147611 Hs.6354 


stromal cell derived factor receptor 1 


107086 


AA609519 


NM_012331 Hs.26458 


methionine sulfoxide reductase A 


107216 


D51069 


D51069 Hs.211579 


melanoma cell adhesion molecule 


107385 


U97519 


NM_005397 Hs.16426 


podocalyxin-like 


107444 


W28391 


W28391 Hs.343258 


proliferation-associated 2G4, 38kD 


107985 


AA035638 


T40064 Hs.71968 


Homo sapiens mRNA; cDNA DKFZp564F053 (fr 


108507 


AA083514 


AI554545 Hs.68301 


ESTs 


108695 


AA121315 


AB029000 Hs.70823 


KIAA1077 protein 


108931 


AA147186 


AA147186 


gb:zo38d01.s1 Stratagene endothelial cel 


109001 


AA156125 


AI056548 Hs.72116 


hypothetical protein FLJ20992 similar to 


109195 


AA188932 


AF047033 Hs.132904 


solute carrier family 4, sodium bicarbon 


109390 


AA219653 


AW007485 Hs.87125 


EH-domain containing 3 


109456 


AA232645 


AW956580 Hs.42699 


ESTs 


109737 


F10078 


AA055415 Hs.13233 


ESTs, Moderately similar to A47582 B-cel 


110411 


H48032 


AW001579 Hs.9645 


Homo sapiens mRNA for KIAA1741 protein, 


110660 


H82117 


AA782114 Hs.28043 


ESTs 






AA035211 Hs.17404 




111018 


N54067 


AI287912 Hs.3628 


mitogen-activated protein kinase kinase 


111091 


N59858 


AA300067 Hs.33032 


hypothetical protein DKFZp434N185 


111356 


N90933 


BE301871 Hs.4867 


mannosyl (alpha-1,3-)-glycoprotein beta- 


111378 


N93764 


AW160993 Hs.326292 


hypothetical gene DKFZp434A1114 


111741 


R26124 


AB020653 Hs.24024 


KIAA0846 protein 


111769 


R27957 


AW629414 Hs.24230 


ESTs 


112318 


R55470 


AW083384 Hs.11067 


ESTs, Highly similar to T46395 hypotheti 
T16550 


AA307634 Hs.6650 


vacuolar protein sorting 45B (yeast homo 






AW194301 Hs.339283 


Human DNA sequence from clone RP1-187J11 






H83265 Hs.8881 


ESTs, Weakly similar to S41044 chromosom 




T88700 


BE178110 Hs.173374 


Homo sapiens cDNA FLJ10500 fis, clone NT 


113542 


T90527 


H43374 Hs.7890 


Homo sapiens mRNA for KIAA1671 protein, 




W42789 


AW880709 Hs.283683 


chromosome 8 open reading frame 4 


113847 


W60002 


NM_005032 Hs.4114 


plastin 3 (T isoform) 


113910 


W78175 


AA113262 Hs.17901 


Homo sapiens, clone IMAGE:3937015, mRNA, 




W84768 


W84768 


gb:zh53d03.s1 Soares_fetal_liver_spleen_ 




W94427 


AL035858 Hs.3807 


FXYD domain-containing ion transport reg 


115061 


AA253217 


AI751438 Hs.41271 


Homo sapiens mRNA full length insert cDN 




AA426573 


AA486620 Hs.41135 


endomucin-2 


115870 


AA432374 


NM_005985 Hs.48029 


snail 1 (drosophila homolog), zinc finge 


115964 


AA446622 


AA987568 Hs.74313 


KIAA1265 protein 




AA478771 


AI767947 Hs.50841 




116264 


AA482594 


D51174 Hs.272239 


lysosomal 


116314 


AA490588 


AI799104 Hs.178705 


Homo sapiens cDNA FLJ11333 fis, clone PL 


116589 


D59570 


AI557212 Hs.17132 


ESTs, Moderately similar to I54374 gene 


117023 


H88157 


AW070211 Hs.102415 


Homo sapiens mRNA; cDNA DKFZp586N0121 (f 


117112 


H94648 


AW969999 Hs.293658 


ESTs 


117156 


H97538 


W73853 


ESTs 


117176 


H98670 


H45100 Hs.49753 


uveal autoantigen with coiled coil domai 


117280 


N22107 


M18217 Hs.172129 


Homo sapiens cDNA: FLJ214C9 fis, clone C 


119559 


W38197 


W38197 


Empirically selected from AFFX single pr 


119866 


W80814 


AA496205 Hs.193700 


Homo sapiens mRNA; cDNA DKFZp586I0324 (f 


120655 


AA287347 


AA305599 Hs.238205 


hypothetical protein PRO2013 


121314 


AA402799 


W07343 Hs.182538 


phospholipid scramblase 4 


121335 


AA404418 


AA404418 


gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_ 


121822 


AA425107 


AI743860 


metallothionein 1E (functional) 


121835 


AA425435 


AB033030 Hs.300670 


KIAA1204 protein 


122331 


AA442872 


AL133437 Hs.110771 


Homo sapiens cDNA: FLJ21904 fis, clone H 


122577 


AA452860 


AA829725 Hs.334437 


hypothetical protein MGC4248 


123160 


AA488687 


AA488687 Hs.284235 


ESTs, Weakly similar to I38022 hypotheti 


123486 


AA599674 


BE019072 Hs.334802 


Homo sapiens cDNA FLJ14680 fis, clone NT 


124059 


F13673 


BE387335 Hs.283713 


ESTs, Weakly similar to S64054 hypotheti 


124339 


H99093 


H99093 Hs.343411 


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 


124358 


N22495 


AW070211 Hs.102415 


Homo sapiens mRNA; cDNA DKFZp586N0121 (f 


124364 


N23031 


AF265555 Hs.250646 


baculoviral IAP repeat-containing 6 


124726 


R15740 


NM_003654 Hs.104576 


carbohydrate (keratan sulfate Gal-6) sul 


124763 


R39610 


BE410405 Hs.76288 


calpain 2, (m/II) large subunit 




W45560 


AL137540 Hs.102541 




125304 


Z39833 


AL359573 Hs.124940 


GTP-binding protein 


125307 


Z40583 


AW580945 Hs.330466 


ESTs 


125329 


AA825437 


AA825437 Hs.58875 


ESTs 


125598 


R66613 


T40064 Hs.71968 


Homo sapiens mRNA; cDNA DKFZp564F053 (fr 


125609 


AA868063 


AA868063 Hs.104576 


carbohydrate (keratan sulfate Gal-6) sul 




AA128075 


AA088767 Hs.83883 


transmembrane, prostate androgen induced 




N66570 


X69086 Hs.286161 


Homo sapiens cDNA FLJ13613 fis, clone PL 


127566 


AI051390 


AI051390 Hs.116731 


ESTs 


127619 


AA627122 


AA627122 Hs.163787 


ESTs 


128453 


X02761 


X02761 Hs.287820 


fibronectin 1 


128495 


AF010193 


NM_005904 Hs.100602 


MAD (mothers against decapentaplegic, Dr 


128515 


AA149044 


BE395085 Hs.10086 


type I transmembrane protein Fn14 


128580 


U82108 


U82108 Hs.101813 


solute carrier family 9 (sodium/hydrogen 


128623 


D78676 


BE076608 Hs.105509 


CTL2 gene 


128642 


L35240 


Z28913 Hs.102948 


enigma (LIM domain protein) 


128669 


AA598737 


W28493 Hs.180414 


heat shock 70kD protein 8 


128903 


R69417 


AW150717 Hs.345728 


STAT induced STAT inhibitor 3 


128914 


AA232837 


AW867491 Hs.107125 


plasmalemma vesicle associated protein 


129087 


N72695 


AI348027 Hs.108557 


hypothetical protein PP1057 


129188 


M30257 


NM_001078 Hs.109225 


vascular cell adhesion molecule 1 


129226 


M96843 


BE222494 Hs.180919 


inhibitor of DNA binding 2, dominant neg 


129265 


X68277 


AA530892 Hs.171695 


dual specificity phosphatase 1 


129345 


AA292440 


R22497 Hs.110571 


growth arrest and DNA-damage-inducible, 


129468 


J03040 


AW410538 Hs.111779 


secreted protein, acidic, cysteine-rich 


129488 


AA228107 


AW966728 Hs.54642 


methionine adenosyltransferase II, beta 


129498 


AA449789 


AA449789 Hs.75511 


connective tissue growth factor 


129557 


W01367 


AL045404 Hs.46366 


KIAA0948 protein 




AA610116 


AA209534 Hs.284243 


tetraspan NET-6 protein 


129627 


AA258308 


T40064 Hs.71968 


Homo sapiens mRNA; cDNA DKFZp564F053 (fr 


129762 


AA460273 


AA453694 Hs.12372 


tripartite motif protein TRIM2 


129884 


AA286710 


AF055581 Hs.13131 


lysosomal 


130018 


T68873 


AA353093 


metallothionein 1L 


130147 


D63476 


D63476 Hs.172813 


PAK-interacting exchange factor beta 


130178 


M62403 


U20982 Hs.1516 


insulin-like growth factor-binding prote 


130282 


X55740 


BE245380 Hs.153952 


5' nucleotidase (CD73) 
130431 


L10284 


AW505214 Hs.155560 


130495 


AA243278 


AW250380 Hs.109059 


130553 


AA430032 


AF062649 Hs.252587 


130638 


H16402 


AW021276 Hs.17121 


130639 


D59711 


AI557212 Hs.17132 


130657 


T94452 


AW337575 Hs.201591 


130686 


AA431571 


BE548267 Hs.337986 


130776 


R79356 


AF167706 Hs.19280 


130818 


AA280375 


AW190920 Hs.19928 


130840 


Z49269 


BE048821 Hs.20144 


130899 


Z41740 


AI077288 Hs.296323 


131002 


AA121543 


AL050295 Hs.22039 


131080 


J05008 


NM_001955 Hs.2271 


131084 


AA101878 


NM_017413 Hs.303084 


131091 


T35341 


AJ271216 Hs.22680 


131107 


N87590 


BE620886 Hs.75354 


131182 


AA256153 


AI824144 Hs.23912 


131207 


W74533 


AF104266 Hs.24212 


131319 


U25997 


NM_003155 Hs.25590 


131328 


V01512 


AW939251 Hs.25647 


131328 


V01512 


AW939251 Hs.25647 


131328 


V01512 


AW939251 Hs.25647 


131328 


V01512 


AW939251 Hs.25647 


131509 


X56681 


X56681 Hs.2780 


131555 


AA161292 


T47364 Hs.278613 


131564 


AA491465 


T93500 Hs.28792 


131573 


AA046593 


AA040311 Hs.28959 


131692 


D50914 


BE559681 Hs.30736 


131756 


D45304 


AA443966 Hs.31595 


131859 


M90657 


AW960564 


131909 


W69127 


NM_016558 Hs.274411 


131915 


AA316186 


AI161383 Hs.34549 


132046 


AA384503 


AI359214 Hs.179260 


132050 


AA136353 


AI267615 Hs.38022 


132151 


AA044755 


BE379499 Hs.173705 


132164 


U84573 


AI752235 Hs.41270 


132187 


AA058911 


AA235709 Hs.4193 


132303 


AA620962 


BE177330 Hs.325093 


132314 


AA285290 


AF112222 Hs.323806 


132358 


X60486 


NM_003542 Hs.46423 


132398 


R31641 


AA876616 Hs.16979 


132421 


AA489190 


AW163483 Hs.48320 


132490 


F13782 


NM_001290 Hs.4980 


132520 


AA257993 


AA257992 Hs.50651 


132546 


M24283 


M24283 Hs.168383 


132610 


AA443114 


AA160511 Hs.5326 


132716 


T35289 


BE379595 Hs.283738 


132840 


N23817 


BE218319 Hs.5807 


132883 


AA047151 


AA373314 Hs.5897 


132968 


N77151 


AF234532 Hs.61638 


132989 


AA480074 


AA480074 Hs.331328 


132999 


Y00787 


Y00787 Hs.624 


133071 


T99789 


BE384932 Hs.64313 


133076 


W84341 


AW946276 Hs.6441 


133099 


L09209 


W16518 Hs.279518 


133147 


D12763 


AA026533 Hs.66 


133149 


T16484 


AA370045 Hs.6607 


133161 


AA253193 


AW021103 Hs.6631 


133200 


AA432248 


AB037715 Hs.183639 


133220 


X82200 


NM_006074 Hs.318501 


133260 


AA083572 


AA403045 Hs.6906 


133295 


L00352 


AI147861 Hs.213289 


133349 


N75791 


AW631255 Hs.8110 


133391 


X57579 


AW103364 Hs.727 


133398 


X02612 


NM_000499 Hs.72912 


133436 


H44631 


BE294068 Hs.737 


133454 


AA090257 


BE547647 Hs.177781 


133478 


X83703 


X83703 Hs.31432 






BE619053 Hs.170001 


133510 


AA227913 


AW880841 Hs.96908 


133517 


X52947 


NM_000165 Hs.74471 


133526 


M11313 


AU077051 Hs.74561 




L14837 


NM_003257 Hs.74614 


133562 


M60721 


M60721 Hs.74870 


133584 


D90209 


D90209 Hs.181243 


133590 


T67986 


T70956 Hs.75106 



mitochondrial ribosomal protein L12 
pituitary tumor-transforming 1 
ESTs 

ESTs, Moderately similar to I54374 gene 
ESTs 

Homo sapiens cDNA FLJ10934 fis, clone OV 

cysteine-rich motor neuron 1 

hypothetical protein SP329 

small inducible cytokine subfamily A (Cy 

serum/glucocorticoid regulated kinase 

KIAA0758 protein 

endothelin 1 

apelin; peptide ligand for APJ receptor 



GCN1 (general control of amino-acid synt 



v-fos FBJ murine osteosarcoma viral onco 
v-fos FBJ murine osteosarcoma viral onco 
v-fos FBJ murine osteosarcoma viral onco 
v-fos FBJ murine osteosarcoma viral onco 
jun D proto-oncogene 
interferon, alpha-inducible protein 27 
Homo sapiens cDNA FLJ11041 fis, clone PL 
ESTs 

KIAA0124 protein 
ESTs 

transmembrane 4 superfamily member 1 

SCAN domain-containing 1 

ESTs, Highly similar to S94541 1 clone 4 

chromosome 14 open reading frame 4 

ESTs 

Homo sapiens cDNA: FLJ22050 fis, clone H 

procollagen-lysine, 2-oxoglutarate 5-dio 

DKFZP58601624 protein 

Homo sapiens cDNA: FLJ21210 fis, clone C 

pinin, desmosome associated protein 

H4 histone family, member G 

ESTs, Weakly similar to A43932 mucin 2 p 

double ring-finger protein, Dorfin 

LIM domain binding 2 

Janus kinase 1 (a protein tyrosine kinas 

intercellular adhesion molecule 1 (CD54) 

amino acid system N transporter 2; porcu 

casein kinase 1, alpha 1 

GTPase Rab14 

Homo sapiens mRNA; cDNA DKFZp586P1622 (f 
myosin X 

hypothetical protein FLJ13213 



ESTs, Weakly similar to AF257182 1 G-pro 

Homo sapiens mRNA; cDNA DKFZp586J021 (fr 

amyloid beta (A4) precursor-like protein 

interleukin 1 receptor-like 1 

AXIN1 up-regulated 

hypothetical protein FLJ20373 

hypothetical protein FLJ10210 

Homo sapiens mRNA full length insert cDN 

Homo sapiens cDNA: FLJ23197 fis, clone R 

low density lipoprotein receptor (famili 

L-3-hydroxyacyl-Coenzyme A dehydrogenase 

inhibin, beta A (activin A, activin AB a

cytochrome P450, subfamily I (aromatic c 

immediate early protein 

hypothetical protein MGC5318

cardiac ankyrin repeat protein 

eukaryotic translation initiation factor 

p53-induced protein 

gap junction protein, alpha 1, 43kD (con 

alpha-2-macroglobulin

tight junction protein 1 (zona occludens 

H2.0 (Drosophila)-like homeo box 1

activating transcription factor 4 (tax-r 

clusterin (complement lysis inhibitor, S 
133617 AA148318


BE244334 Hs.75249 


ADP-ribosylation factor-like 6 interacti 


133651 U97105


AI301740 Hs.173381 


dihydropyrimidinase-like 2


133671 T25747 


AW503116 Hs.301819 


zinc finger protein 146 


133678 K02574 


AW247252 


nucleoside phosphorylase 


133681 D78577 


AI352558 




133722 X53331 


AW969976 Hs.279009 


matrix Gla protein 


133730 S73591 


BE242779 Hs.179526 


upregulated by 1,25-dihydroxyvitamin D-3


133750 X95735 


BE410769 Hs.75873 


zyxin 


133802 L16862 


AW239400 Hs.76297 


G protein-coupled receptor kinase 6 


133825 U44975 


BE616902 Hs.285313 


core promoter element binding protein 


133838 M97796 


BE222494 Hs.180919 


inhibitor of DNA binding 2, dominant neg 


133859 U86782 


U86782 Hs.178761 


26S proteasome-associated pad1 homolog 


133889 AA099391 


U48959 Hs.211582


myosin, light polypeptide kinase 


133960 M19267 


M19267 Hs.77899 


tropomyosin 1 (alpha) 


133975 D29992 


C18356 Hs.295944 


tissue factor pathway inhibitor 2 


133977 L19314 
134039 S78569 


AA125839 Hs.250666
NM_002290 Hs.78672


hairy (Drosophila)-homolog 
laminin, alpha 4 


134075 U28811 


NM_012201 Hs.78979


Golgi apparatus protein 1 


134081 L77886 


AL034349 Hs.79005 


protein tyrosine phosphatase, receptor t 


134164 C14407 


AW245540 Hs.79516 


brain abundant, membrane attached signal 


134203 M60278 


AA161219 Hs.799 


diphtheria toxin receptor (heparin-bindi 


134238 R81509 


AA102179 Hs.160726 


Homo sapiens cDNA FLJ11680 fis, clone HE 


134299 AA487558 


AW580939 Hs.97199 


complement component C1q receptor 


134332 D86962 
134339 AA478971 


D86962 Hs.81875 
R70429 Hs.81988 


growth factor receptor-bound protein 10 
disabled (Drosophila) homolog 2 (mitogen 


134343 D50683 


D50683 Hs.82028 


transforming growth factor, beta recepto 


134381 U56637 


AI557280 Hs.184270 


capping protein (actin filament) muscle


134403 M61199 


AA334551


sperm specific antigen 2 


134416 M28882 


X68264 Hs.211579 


melanoma cell adhesion molecule 


134493 X15183 


M30627 Hs.289088 


heat shock 90kD protein 1, alpha 


134558 S53911 


NM_001773 Hs.85289


CD34 antigen 


134817 U20734 


AU076592 Hs.198951 


jun B proto-oncogene 


134983 D28235 


D28235 Hs.196384 


prostaglandin-endoperoxide synthase 2 (p 


134989 AA236324 


AW968058 Hs.92381 


nudix (nucleoside diphosphate linked moi 


135052 AA148923 


AL136653 Hs.93675


decidual protein induced by progesterone 


135062 AA174183 


AK000967 Hs.93872 


KIAA1682 protein 


135069 AA456311 


AA876372 Hs.93931 


Homo sapiens mRNA; cDNA DKFZp667D095 (fr 


135071 L08069 


W27190 Hs.94 


DnaJ (Hsp40) homolog, subfamily A, membe 


135073 AA452000 


W55956 Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


135170 AA282140 


T53169 Hs.9587 


Homo sapiens cDNA: FLJ22290 fis, clone H 


135196 J02854 


C03577 Hs.9615 


myosin regulatory light chain 2, smooth 


135348 AA442054 


U80983 Hs.268177 


phospholipase C, gamma 1 (formerly subty 


134404 AB000450 


AB000450 Hs.82771 


vaccinia related kinase 2 


439561 AB002380 


AF180681 Hs.6582


Rho guanine exchange factor (GEF) 12 


100082 AB003103 


AA130080 Hs.4295


proteasome (prosome, macropain) 26S subu 


132817 AB004884 


N27852 Hs.57553 


tousled-like kinase 2 


130150 AF000573 


BE094848 Hs.15113 


homogentisate 1,2-dioxygenase (homogenti


100104 AF008937 


AF008937 


syntaxin 16 


447973 AF009301 


AB011169 Hs.20141 


similar to S. cerevisiae SSM4


332613 AF009368 


AF029674 Hs.173422 


KIAA1605 protein 


100113 D00591 


NM_001269 Hs.84746


chromosome condensation 1 


133980 D00760 


AA294921 Hs.348024 


v-ral simian leukemia viral oncogene hom


100129 D11139 


AA469369 Hs.5831 


tissue inhibitor of metalloproteinase 1 


100154 D14657 


H60720 Hs.81892 


KIAA0101 gene product 


100169 D14878 


AL037228 Hs.82043 


D123 gene product


129718 D17716 


NM_002410 Hs.121502


mannosyl (alpha-1,6-)-glycoprotein beta-


100190 D21090 


M91401 Hs.178558 


RAD23 (S. cerevisiae) homolog B 


134742 D26135 


NM_001346 Hs.89462


diacylglycerol kinase, gamma (90kD) 


100211 D26528 


D26528 Hs.123058 


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 


100238 D30742 


L24959 Hs.348 


calcium/calmodulin-dependent protein kin 


130283 D31762 


NM_012288 Hs.153954


TRAM-like protein 


134237 D31765 


D31765 Hs.170114 


KIAA0061 protein 


100248 D31888 


NM_015156 Hs.78398


KIAA0071 protein 


100256 D38128 


D25418 Hs.393 


prostaglandin I2 (prostacyclin) receptor


100262 D38500 


D38500 Hs.278468 


postmeiotic segregation increased 2-like 


134329 D38551 


N92036 Hs.81848 


RAD21 (S. pombe) homolog 


100281 D42087 


AF091035 Hs.184627 


KIAA0118 protein


100294 D49396 
100327 D55640 


AA331881 Hs.75454 


peroxiredoxin 3 


100335 D63391 


AW247529 Hs.6793 


platelet-activating factor acetylhydrola 


134495 D63477 


D63477 Hs.84087 


KIAA0143 protein 


100338 D63483 


D86864 Hs.57735 


acetyl LDL receptor; SREC 


135152 D64015 


M96954 Hs.182741 


TIA1 cytotoxic granule-associated RNA-bi 


134269 D79990 


NM_014737 Hs.80905


Ras association (RalGDS/AF-6) domain fam 


100372 D79997 


NM_014791 Hs.184339


KIAA0175 gene product 


134304 D80010 


BE613486 Hs.81412 


lipin 1 
4 D87012
7 D87075 
3 D87432 
3 D87448 
3 D87845 

1 HG1098-HT1098 

2 HG2167-HT2237 

1 HG2415-HT2511 

2 HG2825-HT2949 

2 HG2887-HT3031 
9 HG4660-HT5073 

5 HG4704-HT5146 
5 HG884-HT884 

3 HG919-HT919 

4 J00212 
7 J04029 
9 J04031 
7 J04088 
3 J04543 
3 L06139 
3 L07540 



) M31158 

5 M31166 

5 M31210 

) M55420 



D84284 Hs.66052 
AW291587 Hs.82733 
D86978 Hs.84790 
D87012 Hs.194685 
AF164142 Hs.82042 
D87432 Hs.10315 
AA013051 Hs.91417 
NM_000437 Hs.234392
X70377 Hs.121489 
AA019521 Hs.301946 
NM_004091 Hs.231444
BE613608 Hs.142653
AI368680 Hs.816 
AL039123 Hs.103042 
Hs.172816 
AF002225 Hs.180686
AF128542 Hs.166846
J00212

J04029 Hs.99936 
AW067805 Hs.172665
J04088 Hs.156346 
J04543 Hs.78637 
T29618 Hs.89640 
AA460085 Hs.171075
L08895 Hs.78995 
L11239 Hs.36993 
BE409525 Hs.902 
Z83689 Hs.114765
AI984625 Hs.9884 
L14922 Hs.166563
BE297635 Hs.3069 
NM_005308 Hs.211569
H87879 Hs.102267
AF083892 Hs.75608 
C18356 Hs.295944 
NM_002419 Hs.89449
AA101043 Hs.151254 
W76332 Hs.79107 
BE313625 Hs.57435 
AF168418 Hs.116784 
BE535511 
L41607 Hs.934 
AW250122 Hs.154879 
AW675039 Hs.1227 
AW675039 Hs.1227 
AW005903 Hs.78501 
AA557660 Hs.76152 
BE267931 Hs.78996 
M21305 
M22092 

NM_000546 Hs.1846
NM_002884 Hs.865
NM_002890 Hs.758
Hs.74502



R84694 Hs.79194 

AA535244 Hs.78305

M29551 Hs.151531 

M29971 Hs.1384 
M30269 

M31158 Hs.77439 

M31166 Hs.2050 

BE246154 Hs.154210 

S55271 Hs.247930 

AW382987 Hs.88474

AA393273 Hs.75133 

D90337 Hs.247916 



Hs.173125 
Hs.77813 
Hs.77813 
Hs.77813 
Hs.77813 
Hs.77813 
Hs.62354 



CD38 antigen (p45) 

nidogen 2 

KIAA0225 protein 

topoisomerase (DNA) III beta 

solute carrier family 23 (nucleobase tra 

solute carrier family 7 (cationic amino 

topoisomerase (DNA) II binding protein 

platelet-activating factor acetylhydrola 

cystatin D 

lysosomal 

Homo sapiens, Similar to hypothetical pr 
ret finger protein 

SRY (sex determining region Y)-box 2 
microtubule-associated protein 1B 
neuregulin 1 

ubiquitin protein ligase E3A (human papi 
polymerase (DNA directed), epsilon 
Empirically selected from AFFX single pr 
keratin 10 (epidermolytic hyperkeratosis



topoisomerase (DNA) II alpha (170kD) 
annexin A7

TEK tyrosine kinase, endothelial (venous 
replication factor C (activator 1) 5 (35 
MADS box transcription enhancer factor 2 
gastrulation brain homeobox 1 
neurofibromin 2 (bilateral acoustic neur 
myeloid/lymphoid or mixed-lineage leukem 
spindle pole body protein 
replication factor C (activator 1) 1 (14 
heat shock 70kD protein 9B (mortalin-2) 
G protein-coupled receptor kinase 5 
lysyl oxidase 

tight junction protein 2 (zona occludens
tissue factor pathway inhibitor 2 
mitogen-activated protein kinase kinase 
kallikrein 7 (chymotryptic, stratum corn
mitogen-activated protein kinase 14 
solute carrier family 11 (proton-coupled
thyroid hormone receptor interactor 4 
transmembrane trafficking protein 
glucosaminyl (N-acetyl) transferase 2, 1 
DiGeorge syndrome critical region gene D 



aminolevulinate, delta-, dehydratase 
uroporphyrinogen decarboxylase 
decorin 

proliferating cell nuclear antigen 
gb:Human alpha satellite and satellite 3 
gb:Human neural cell adhesion molecule ( 
tumor protein p53 (Li-Fraumeni syndrome) 
RAP1A, member of RAS oncogene family
RAS p21 protein activator (GTPase activa 
chymotrypsinogen B1 
cyclin B1

cAMP responsive element binding protein 
RAB2, member RAS oncogene family 
protein phosphatase 3 (formerly 2B), cat 
O-6-methylguanine-DNA methyltransferase
nidogen (enactin) 

protein kinase, cAMP-dependent, regulato 
pentaxin-related gene, rapidly induced b
endothelial differentiation, sphingolipi 
Epsilon , IgE 

prostaglandin-endoperoxide synthase 1 (p
transcription factor 5-like 1 (mitochond
natriuretic peptide precursor C 
phospholipase A2, group IVA (cytosolic, 
ubiquitin-conjugating enzyme E2A (RAD6 h 
peptidylprolyl isomerase F (cyclophilin 



sphingomyelin phosphodiesterase 1, 
sphingomyelin phosphodiesterase 1, 
sphingomyelin phosphodiesterase 1, 
sphingomyelin phosphodiesterase 1, 
cell division cycle 4-like 
101812


M86934 


BE439894 Hs.78991 


DNA segment, numerous copies, expressed 


101813 


M87338 


NM_002914 Hs.139226


replication factor C (activator 1) 2 (40 


133396 


M96326 


M96326 Hs.72885 


azurocidin 1 (cationic antimicrobial pro 


428161 


M96954 


M96954 Hs.182741 


TIA1 cytotoxic granule-associated RNA-bi 


129026 


M98833 


AL120297 Hs.108043 


Friend leukemia virus integration 1 


101901 


S66793 


H38026 Hs.308 


arrestin 3, retinal (X-arrestin) 


134831 


S72370 


AA853479 Hs.89890 


pyruvate carboxylase 


134039 


S78569 


NM_002290 Hs.78672


laminin, alpha 4 


442355 


S79873 


AA456539 Hs.8262 


lysosomal-associated membrane protein 2


101975 


S83325 


AA079717 Hs.283664 


aspartate beta-hydroxylase 


101977 


S83364 


AF112213 Hs.184062 


putative Rab5-interacting protein 


101978 


S83365 


BE561610 Hs.5809 


putative transmembrane protein; homolog 


101998 


U01212 


U01212 Hs.248153 


olfactory marker protein 


102003 


U01922 


U01922 Hs.125565 


translocase of inner mitochondrial membr 


102007 


U02556 


U02556 Hs.75307 


t-complex-associated-testis-expressed 1-


102009 


U02680 


BE245149 Hs.82643 


protein tyrosine kinase 9 


416658 


U03272 


U03272 Hs.79432 


fibrillin 2 (congenital contractural ara


132951 


U04209 


AW821182 Hs.61418 


microfibrillar-associated protein 1 


135389 


U05237 


U05237 Hs.99872 


fetal Alzheimer antigen 


102048 


U07225 


U07225 Hs.339 


purinergic receptor P2Y, G-protein coupl 


130145 


U07620 


U34820 Hs.151051 


mitogen-activated protein kinase 10


303153 


U09759 


U09759 Hs.246857 


mitogen-activated protein kinase 9 


420269 


U09820 


U72937 Hs.96264 


alpha thalassemia/mental retardation syn 


102095 


U11313 


U11313 Hs.75760 


sterol carrier protein 2 


102123 


U14518 


NM_001809 Hs.1594


centromere protein A (17kD) 


102126 


U14575 


AW950870 Hs.78961 


protein phosphatase 1, regulatory (inhib 


102133 


U15173 


AU076845 Hs.155596 


BCL2/adenovirus E1B 19kD-interacting pro 


102139 


U15932 


NM_004419 Hs.2128


dual specificity phosphatase 5 


102162 


U18291 


AA450274 Hs.1592 


CDC16 (cell division cycle 16, S. cerevi 


102164 


U18300 


NM_000107 Hs.77602


damage-specific DNA binding protein 2 (4 


427653 


U18383 


AA159001 Hs.180069 


nuclear respiratory factor 1 


131817 


U20536 


U20536 Hs.3280 


caspase 6, apoptosis-related cysteine pr 


102200 


U21551 


AA232362 Hs.157205 


branched chain aminotransferase 1, cytos 


102210 


U23028 


BE619413 Hs.2437 


eukaryotic translation initiation factor 


102214 


U23752 


U23752 Hs.32964 


SRY (sex determining region Y)-box 11


132811 


U25435 


U25435 Hs.57419 


CCCTC-binding factor (zinc finger protei


131319 


U25997 


NM_003155 Hs.25590


stanniocalcin 1 


102256 


U28251 


U28251 Hs.53237 


ESTs, Highly similar to Z159_HUMAN ZINC 


132316 


U28831 


U28831 Hs.44566 


KIAA1641 protein


102269 


U30245 


U30245 


gb:Human myelomonocytic specific protein


417526 


U32315 


AA568906 Hs.82240 


syntaxin 3A 


102293 


U32439 


AF090116 Hs.79348 


regulator of G-protein signalling 7 


102298 


U32849 


AA382169 Hs.54483 


N-myc (and STAT) interactor


102325 


U35139 


AI815867 Hs.50130 


necdin (mouse) homolog 
eukaryotic translation initiation factor 


428734 


U36764 


BE303044 Hs.192023 


102361 


U39400 


AA223616 Hs.75859 


chromosome 11 open reading frame 4 


102367 


U39657 


U39656 Hs.118825 


mitogen-activated protein kinase kinase 


102388 


U41344 


AA362907 Hs.76494 


proline arginine-rich end leucine-rich r 


102394 


U41766 


NM_003816 Hs.2442


a disintegrin and metalloproteinase doma 


129829 


U41813 


AF010258 Hs.127428 


homeobox A9


102409 


U43286 


BE300330 Hs.118725


selenophosphate synthetase 2 


133746 


U44378 


AW410035 Hs.75862 


MAD (mothers against decapentaplegic, Dr 


102423 


U44754 


Z47542 Hs.179312 


small nuclear RNA activating complex, po


132828 


U47011 


AB014615 Hs.57710 


fibroblast growth factor 8 (androgen-ind 


132828 


U47011 


AB014615 Hs.57710 


fibroblast growth factor 8 (androgen-ind 


132828 


U47011


AB014615 Hs.57710 


fibroblast growth factor 8 (androgen-ind 


132828 


U47011 


AB014615 Hs.57710 


fibroblast growth factor 8 (androgen-ind 


425322 


U47077 


U63630 Hs.155637 


protein kinase, DNA-activated, catalytic


102450 


U48251 


U48251 Hs.75871 


protein kinase C binding protein 1 


129350 


U50535 


U50535 Hs.110630


Human BRCA2 region, mRNA sequence CG006 


102534 


U56833 


U95759 Hs.198307 


von Hippel-Lindau binding protein 1 


130457 


U58091 


AB014595 Hs.155976 


cullin 4B 


135065 


U58837 


AA019401 Hs.93909 


cyclic nucleotide gated channel beta 1 


102560 




R97457 Hs.63984 


cadherin 13, H-cadherin (heart) 


102567 


U59863 


U63830 Hs.146847 


TRAF family member-associated NFKB activ


417173 


U67122 


U61397 Hs.81424 


ubiquitin-like 1 (sentrin) 


102638 


U67319 


U67319 Hs.9216 


caspase 7, apoptosis-related cysteine pr 


132733 


U68019 


AW081883 Hs.211578 


Homo sapiens cDNA: FLJ23037 fis, clone L 






U92649 Hs.64311 




102663 


U70322 


NM_002270 Hs.168075


karyopherin (importin) beta 2 


134660 


U73524 


U73524 Hs.87465 


ATP/GTP-binding protein 


102735 


U79267 


AF111106 Hs.3382 


protein phosphatase 4, regulatory subuni 


102741 


U79291 


AW959829 Hs.83572 


hypothetical protein MGC14433 


130564 


U82671 


U82671 Hs.36980 


melanoma antigen, family A, 2 


130564 


U82671 


U82671 Hs.36980 


melanoma antigen, family A, 2 


132164 


U84573 


AI752235 Hs.41270 


procollagen-lysine, 2-oxoglutarate 5-dio 
102826 U91316

102831 U91932 

102846 U96131 

129777 U97018 

134161 U97188 

134854 V00503 

429257 X04327 

413985 X06389 

419768 X07496 

102915 X07820 

134656 X14787 

413858 X15525 

102968 X16396 



130282 X55740 

134542 X57025 

128568 X60673 

128568 X60673 

103093 X60708 

413076 X62048 

129063 X63097 

424460 X63563 

411077 X64037 

103181 X69636 



D85390 Hs.5057 
NM_007274 Hs.8679
AA262170 Hs.80917 



131486 X83107 
130729 X84194
103334 X85753 
132645 X87870 
135094 X89066 

103352 X89398 
103352 X89398

103353 X89399 
132173 X89426 
103371 X91247 
131584 X91648 

103376 X92098
103378 X92110 
128510 X94703 
103410 X96506 
133490 X97230 

332689 X97230
103438 X98263 
103440 X98296 
103452 X99584 
133536 Y00264 

420234 Y07566
426502 Y07759 
134652 Y07827 

132083 Y07867 
103500 Y09443 

134389 Y09858

132084 Y12394 
103540 Z11559 
133152 Z11695 
103548 Z15005 

103612 Z46261

129092 AA011243 
103692 AA018418 
103695 AA018758 
129796 AA018804 

434993 AA031993
132683 AA044217 
131887 AA046548 
103723 AA057447 
453368 AA058376 

133260 AA083572



U97018 Hs.12451 
AA634543 Hs.79440
J03464 Hs.179573 
AW163799 Hs.198365 
AI018666 Hs.75667 
T72104 Hs.93194 
X07820 Hs.2258 
AI750878 Hs.87409 
NM_001610Hs.75589 
AU076511 Hs.154672 
X16609 Hs.183805 
AI808780 Hs.227730 
AI808780 Hs.227730 
AW500470 Hs.117950
BE018302 Hs.2894 



carboxypeptidase D 

cytosolic acyl coenzyme A thioester hydr 
adaptor-related protein complex 3, sigma 
thyroid hormone receptor interactor 13 
echinoderm microtubule-associated protei
IGF-II mRNA-binding protein 3
collagen, type I, alpha 2 



H12912 
S79876 
U10564 



AW977263 Hs.68257 
X69636 Hs.334731 
U43143 Hs.74049 



AW411340 Hs.31314 

BE242144 Hs.12013 

F06972 Hs.27372 

AI963747 Hs.18573
NM_001260 Hs.25283

AI654712 Hs.54424
NM_003304 Hs.250687

H09366 Hs.78853

H09366 Hs.78853 

X89399 Hs.119274 

X89426 Hs.41716 

X91247 Hs.13046 

AA598509 Hs.29117 

AL036166 Hs.323378 

AL119690 Hs.153618
X94703 

AA158294 Hs.295362

AF022044 Hs.274601 

AF022044 Hs.274601 

AW175781 Hs.152720 

X98296 Hs.77578



W25797.comp 
AW404908 Hs.96038 
Y07759 Hs.170157 
NM_007048 Hs.284283
BE386490 Hs.279663 
AW408009 Hs.22580 
Y09858 Hs.82577 
NM_002267 Hs.3886
NM_002197 Hs.154721
Z11695 Hs.324473 
Z15005 Hs.75573 



D56365 Hs.63525 
AW137912 Hs.227583 
AW207152 Hs.186600
BE218319 Hs.5807 
AA306325 Hs.4311 



W17064 Hs.332848 

BE274312 Hs.214783 

W20296 Hs.288178 

AA403045 Hs.6906 



synaptophysin 
apolipoprotein A-I

matrix metalloproteinase 10 (stromelysin

thrombospondin 1 

acid phosphatase 2, lysosomal 

methylene tetrahydrofolate dehydrogenase 

ankyrin 1, erythrocytic

integrin, alpha 6 

integrin, alpha 6 

multifunctional polypeptide similar to S 
placental growth factor, vascular endoth 



insulin-like growth factor 1 (somatomedi 
adenylate kinase 3 
adenylate kinase 3 



wee1 (S. pombe) homolog

Rhesus blood group, D antigen 

polymerase (RNA) II (DNA directed) polyp 

general transcription factor IIF, polype 

Homo sapiens, clone IMAGE:3448306, mRNA, 

fms-related tyrosine kinase 4 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 

retinoblastoma-binding protein 7

ATP-binding cassette, sub-family E (OABP 

BMX non-receptor tyrosine kinase 

acylphosphatase 1, erythrocyte (common) 

cyclin-dependent kinase 8

hepatocyte nuclear factor 4, alpha 

transient receptor potential channel 1 

uracil-DNA glycosylase

uracil-DNA glycosylase

RAS p21 protein activator (GTPase activa 

endothelial cell-specific molecule 1 



purine-rich element binding protein A 
coated vesicle membrane protein 
HCGVIII-1 protein 

RAB28, member RAS oncogene family 

DR1-associated protein 1 (negative cofac 

killer cell immunoglobulin-like receptor 

killer cell immunoglobulin-like receptor 

M-phase phosphoprotein 6 

ubiquitin specific protease 9, X chromos 

SMT3 (suppressor of mif two 3, yeast) ho 

Hs.177485 amyloid beta (A4) precursor protein (pro 

Ric (Drosophila)-like, expressed in many 

myosin VA (heavy polypeptide 12, myoxin) 

butyrophilin, subfamily 3, member A1 

Pirin 

alkylglycerone phosphate synthase 
spindlin-like

karyopherin alpha 3 (importin alpha 4) 
aconitase 1, soluble 
mitogen-activated protein kinase 1 
centromere protein E (312kD)
H3 histone family, member A 
poly(rC)-binding protein 2

iXmapXp11.23L- 



GTPase Rab14 

SUMO-1 activating enzyme subunit 2 
WD repeat domain 4 
SWI/SNF related, matrix associated, acti 
Homo sapiens cDNA FLJ14041 fis, clone HE 
Homo sapiens cDNA FLJ11968 fis, clone HE
Homo sapiens cDNA: FLJ23197 fis, clone R
103766 AA088744


AI920783 


Hs.191435 




103767 AA089688 


BE244667 




CGI-100 protein 


132051 AA091284 


AA393968 


Hs.180145 


HSPC030 protein 


103773 AA092700 


AI219323 


Hs.101077 


ESTs, Weakly similar to T22363 hypotheti 


135289 AA092968 


AW372569 


Hs.9788 


hypothetical protein MGC10924 similar to 


409659 AA094800


AW970843 


Hs.55682 


eukaryotic translation initiation factor 


103794 AA100219 


AF244135 


Hs.30670 


hepatocellular carcinoma-associated anti 


131471 AA114885 


AA164842


Hs.192619 


KIAA1600 protein 


134319 AA129547 


BE304999 


Hs.285754 


fumarate hydratase 


103807 AA133016 


AW958264 


Hs.103832


similar to yeast Upf1, variant B


446392 AA149507 


AF142419 


Hs.15020 


homolog of mouse quaking QKI (KH domain 


129863 AA151005 


BE379765 


Hs.129872


sperm associated antigen 9 


103850 AA187101 


AA187101 


Hs.213194 


hypothetical protein MGC10895 


103855 AA195179 


W02363 




hypothetical protein FLJ10330


103861 AA206236 


AA206236 


Hs.4944 


hypothetical protein FLJ12783


130634 AA227621 


AI769067 


Hs.127824 


ESTs, Weakly similar to T28770 hypotheti 


447735 AA248283 


AA775268 




Homo sapiens cDNA: FLJ23020 fis, clone L


103909 AA249611


AA249611


Hs.47438 


SH3 domain binding glutamic acid-rich pr 


458928 AA282640 


AF043117 


Hs.24594 


ubiquitination factor E4B (homologous to 


415824 AA287199 


D42039 


Hs.78871 


mesoderm development candidate 2 


129013 AA313990 


AA371156 


Hs.107942 


DKFZP564M112 protein 


129435 AA314256 


AF151852 


Hs.111449 


CGI-94 protein 


103988 AA314389 


AA314389 


Hs.342849 


ADP-ribosylation factor-like 5 


104000 AA324364 


AI146527 


Hs.80475 


polymerase (RNA) II (DNA directed) polyp 


425284 AA329211 


AF155568 


Hs.348043 


NS1-associated protein 1


128629 AA399187 


AL096748 


Hs.102708


DKFZP434A043 protein 


133281 AA421079


AK001601 


Hs.69594 


high-mobility group 20A 


104104 AA422029 


AA422029 


Hs.143640 


ESTs, Weakly similar to hyperpolarizatio


332455 AA425230 


NM_005754 Hs.220689


Ras-GTPase-activating protein SH3-domain 


132091 AA447052 


AW954243 




KIAA0251 protein 


135073 AA452000 


W55956 


Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


131367 AA456687 


AI750575 


Hs.173933 


nuclear factor I/A


129593 AA487015 


AI338247 


Hs.98314 


Homo sapiens mRNA; cDNA DKFZp586L0120 (f 


133505 C01527 


AI630124 


Hs.324504 


Homo sapiens mRNA; cDNA DKFZp586J0720 (f 


132064 C01714 


AA121098


Hs.3838 


serum-inducible kinase 


442351 C01811 


W52642 


Hs.8261 


hypothetical protein FLJ22393 


131427 C02352 


AF151879


Hs.26706 


CGI-121 protein 


433892 C02375 


AI929357 


Hs.323966 


Homo sapiens clone H63 unknown mRNA 


104282 C14448 


C14448 


Hs.332338 


EST 


134827 D16611 


BE314037 


Hs.89866 


coproporphyrinogen oxidase (coproporphyr 


425330 D25216 


D25216 


Hs.155650 


KIAA0014 gene product 


131742 D31352 


AA961420 


Hs.31433 


ESTs 


456935 D58024 


AA370362 


Hs.57958 


EGF-TM7-latrophilin-related protein 


425218 D80897 


NM_014909 Hs.155182


KIAA1036 protein 


104334 D82614 


D82614 


Hs.78771 


phosphoglycerate kinase 1 


134593 D87845 


NM_000437 Hs.234392


platelet-activating factor acetylhydrola 


134731 D89377 


D89377 


Hs.89404 


msh (Drosophila) homeobox homolog 2


445776 H06583 


NM_001310 Hs.13313


cAMP responsive element binding protein- 


131670 H40732 


H03514 


Hs.15589 


ESTs 


104394 H46617 


AA129551


Hs.172129 


Homo sapiens cDNA: FLJ21409 fis, clone C


104402 H56731 


H56731 


Hs.132956 


ESTs 


439130 H75570 


AA306090 


Hs.124707


ESTs 


129077 H78885 


N74724 


Hs.108479 


ESTs 


104417 H81241 


AI819448 


Hs.320861 


Kruppel-like factor 8 


134927 L36531 


L36531 


Hs.91296 


integrin, alpha 8 


129280 M63154 


M63154 


Hs.110014 


gastric intrinsic factor (vitamin B synt 


134498 M63180 


AW246273 Hs.84131 


threonyl-tRNA synthetase 


104460 M91504 


AW955705 


Hs.62604 


Homo sapiens, clone IMAGE:4299322, mRNA, 


104488 N56191 


N56191 


Hs.106511


protocadherin 17 


131248 N78483 


AI038989 


Hs.332633 


Bardet-Biedl syndrome 2 


130017 R14652 


AK000096 


Hs.143198 


inhibitor of growth family, member 3 


104530 R20459 


AK001676 


Hs.12457


hypothetical protein FLJ10814 


104534 R22303 


R22303 




gb:yh26b09.r1 Soares placenta Nb2HP Homo


104544 R33779 


AI091173 


Hs.222362 


ESTs, Weakly similar to p40 [H.sapiens] 


133328 R36553 


AW452738 


Hs.265327 


hypothetical protein DKFZp761 1141 


104567 R64534 


AA040620 


Hs.5672 


hypothetical protein AF140225 


129575 R70621 


F08282 


Hs.278428 


progestin induced protein 


130776 R79356 


AF167706


Hs.19280


cysteine-rich motor neuron 1 


104599 R84933 


AW815036 


Hs.151251




104660 AA007160 


BE298665 


Hs.14846


Homo sapiens mRNA; cDNA DKFZp564D016 (fr 


104667 AA007234 


AI239923 


Hs.63931 


ESTs 


104718 AA018409


AI143020 


Hs.36250 


ESTs, Weakly similar to I38022 hypotheti 


104764 AA025351 


AI039243 


Hs.278585 


ESTs 


104786 AA027168


AA027167 


Hs.10031 


KIAA0955 protein 


104787 AA027317 


AA027317 




gb:ze97d11.s1 Soares_fetal_heart_NbHH19W


134079 AA029423 


AK001751 


Hs.171835 


hypothetical protein FLJ10889 
104804


AA031357 


AI858702


Hs.31803 


104865 


AA045136 


T79340 


Hs.22575 


130828 


AA053400 


AW631469 


Hs.203213 


104907 


AA055829 


AA055829 


Hs.196701 


104943 


AA065217 


AF072873 


Hs.114218


105013 


AA116054


H63789 


Hs.296283 


105024 


AA126311


AA126311 


Hs.9879 


132592 


AA129390


AW803564 


Hs.288850 


105038 


AA130273


AW503733 


Hs.9414 


105077 


AA142919 


W55946 


Hs.234863 


105096 


AA150205


AL042506 


Hs.21599 


129215 


AA176867


AB040930 


Hs.126085 


105169 


AA180321 


BE245294 


Hs.180789 


132796 


AA180487


NM_006283 Hs.173159


427210 


AA187634


BE396283 


Hs.173987 


105200 


AA195399


AA328102 Hs.24641 


130114 


AA234717 


AA233393 


Hs.14992 


105330 


AA234743 


AW338625 


Hs.22120 


105337 


AA234957 


AI468789 


Hs.347187 


422040 


AA235604 


AA172106 Hs.110950


105376 


AA236559 


AW994032 


Hs.8768 


105397 


AA242868 


AA814807 


Hs.7395 


431679 


AA251776 


AK000046 


Hs.343877 


131991 


AA251909 


AF053306 


Hs.36708 


421305 


AA252672 


BE397354 


Hs.324830 


105489 


AA256157 


AA256157 


Hs.24115 


105508 


AA256680 


AA173942 


Hs.326416 


105539 


AA258873 


AB040884 


Hs.109694 


135172 


AA262727 




Hs.12144 


131569 


AA281451 


AL389951 


Hs.271623 


431129 


AA281545 


AL137751 


Hs.263671 


105643 


AA282069 


BE621719 


Hs.173802 


105659 


AA283044 


AA283044 


Hs.25625 


105666 


AA283930


AA426234 


Hs.34906 


105674 


AA284755 


AI609530 


Hs.279789 


105709 


AA291268 




Hs.26761 


105722 


AA291927 


AI922821 


Hs.32433 


105765 


AA343514 


AA299688 


Hs.24183 


115951 


AA398109 


BE546245 


Hs.301048 


130884 


AA398109 


BE546245 


Hs.301048 


105962 


AA405737 


AW880358 


Hs.339806 


105985 


AA406610 


AA406610




106008 


AA411465 


AB033888 


Hs.8619 


457322 


AA416886 


AI815486 


Hs.243901 


134222 


AA424013 


AW855861 


Hs.8025 


446954 


AA424148 


AB037850 


Hs.16621 


106141 


AA424558 


AF031463 


Hs.9302 


447973 


AA424961 


AB011169 Hs.20141 


106157 


AA425367 


W37943 


Hs.34892 


428314 


AA425921 


AW135049 Hs.26285


446727 


AA426220 


AB011095 


Hs.16032 


106196 


AA427735 


AA525993 


Hs.173699 


457714 


AA430673 


AA083764 




133200 


AA432248


AB037715 


Hs.183639


106302 


AA435896 


AA398859 


Hs.18397 


106328 


AA436705 


AL079559 


Hs.28020 


450534 


AA446561 


AI570189 


Hs.25132 


106423 


AA448238 


AB020722 Hs.16714 


439608 


AA449756 


AW864696 


Hs.301732 


106477 


AA450303 


R23324 


Hs.41693 


106503 


AA452411 


AB033042 


Hs.29679 


446999 


AA454566 


AA151520




106543 


AA454667 


AA676939 


Hs.69285 


442007 


AA456437 


AA301116 


Hs.142838 


106589 


AA456646


AK000933 


Hs.28661 


106593 


AA456826 


AW296451 Hs.24605 


106596 


AA456981 


AA452379 




423064 


AA458959 


AF265208 


Hs.8740 




AA459950 


AW958037 Hs.286 


106654 


AA460449 


AW075485 Hs.286049 


131353 


AA463910


AW754182 




106707 


AA464603 


AK000566 Hs.98135 


452909 


AA464606 


NM_015368 Hs.30985


106717 


AA465093 


AA600357 Hs.239489


453141 


AA465692 


AB014548 Hs.31921 


106747 


AA476473


NM_007118 Hs.171957



ESTs, Weakly similar to N-WASP [H.sapien 
B-cell CLL/lymphoma 6, member B (zinc fi
ESTs 

ESTs, Weakly similar to ALU1_HUMAN ALU S 

frizzled (Drosophila) homolog 6 

ESTs, Weakly similar to KIAA0638 protein 

ESTs 

Homo sapiens cDNA: FLJ22528 fis, clone H 
KIAA1488 protein 

Homo sapiens cDNA FLJ12082 fis, clone HE 
Kruppel-like factor 7 (ubiquitous) 
KIAA1497 protein 
S164 protein 

transforming, acidic coiled-coil contain 
eukaryotic translation initiation factor 



hypothetical protein FLJ11151 
ESTs 

myotubularin related protein 1 
Rag C protein 

hypothetical protein FLJ10849
hypothetical protein FLJ23182 
hypothetical protein FLJ20039 
budding uninhibited by benzimidazoles 1
diptheria toxin resistance protein requi 
Homo sapiens cDNA FLJ14178 fis, clone NT
Homo sapiens mRNA; cDNA DKFZp564H1916 (f 
KIAA1451 protein 
KIAA1033 protein 
nucleoporin 50kD 

Homo sapiens mRNA; cDNA DKFZp434I0812 (f

KIAA0603 gene product 

hypothetical protein FLJ11323

ESTs, Weakly similar to T17210 hypotheti 

histone deacetylase 3 

DKFZP586L0724 protein 

ESTs 

ESTs 

sec13-like protein 

sec13-like protein 

hypothetical protein FLJ10120 

gb:zv15b10.s1 Soares_NhHMPu_S1 Homo sapi

SRY (sex determining region Y)-box 18 

Homo sapiens cDNA FLJ20738 fis, clone HE 

Homo sapiens clone 23767 and 23782 mRNA 

DKFZP434I116 protein

phosducin-like 

similar to S. cerevisiae SSM4
KIAA1323 protein 

Homo sapiens cDNA FLJ10643 fis, clone NT
KIAA0523 protein 

ESTs, Weakly similar to ALU1_HUMAN ALU S
hypothetical protein MGC3178 
hypothetical protein FLJ10210 
hypothetical protein FLJ23221 
KIAA0766 gene product 
KIAA0470 gene product 
Rho guanine exchange factor (GEF) 15
hypothetical protein MGC5306 
DnaJ (Hsp40) homolog, subfamily B, membe 
cofactor required for Sp1 transcri
hypothetical protein MGC4485 
neuropilin 1 
nucleolar phosphoprotein Nopp34 
Homo sapiens cDNA FLJ10071 fis, clone HE 
ESTs 

ESTs, Moderately similar to ALU7_HUMAN A
SWI/SNF related, matrix associated, acti
ribosomal protein L4 



gb:RC2-CT0321-131199-011-c01 CT0321 Homo 
hypothetical protein FLJ20559
pannexin 1 

TIA1 cytotoxic granule-associated RNA-bi 
KIAA0S48 protein 

triple functional domain (PTPRF interact 
106773 AA478109


AA478109 


Hs.188833




106781 AA478474 


AA330310 


Hs.24181 


ESTs 


106817 AA480889 


D61216 


Hs.18672 


ESTs 


106846 AA485223 


AB037744 


Hs.34892 


KIAA1323 protein 


106848 AA485254 


AA449014 


Hs.121025 


chromosome 11 open reading frame 5 


106856 AA486183 


W58353 


Hs.285123 


Homo sapiens mRNA full length insert cDN 


418699 AA496936 


BE539639 


Hs.173030 


ESTs, Weakly similar to ALU8_HUMAN ALU S 


107001 AA598589 


AI926520 


Hs.31016 


putative DNA binding protein 


442853 AA593831 


AW021276 Hs.17121 


ESTs 


107054 AA600150 


AI076459


Hs.15978 


KIAA1272 protein 


107059 AA608545 


BE614410 


Hs.23044 


RAD51 (S. cerevisiae) homolog (E coli Re


107080 AA609210 


AL122043


Hs.19221 


hypothetical protein DKFZp566G1424 


107115 AA610108 


BE379623 


Hs.27693 


peptidylprolyl isomerase (cyclophilin)-l


107130 AA620582 


AB033106 


Hs.12913 


KIAA1280 protein 


107156 AA621239 


AA137043 


Hs.9663 


programmed cell death 6-interacting prot 


107174 AA621714 


BE122762


Hs.25338 


ESTs 


130621 AA621718 


AW513087 Hs.16803 


LUC7 (S. cerevisiae)-like


107190 D19573 


AA836401 


Hs.87860 


ESTs 


132626 D25755 


AW504732 Hs.21275 


hypothetical protein FLJ11011 


107217 D51095 


AL080235 


Hs.35861 


DKFZP586E1621 protein 


332584 D60272 


AA357879 


Hs.29423 


ESTs; Weakly similar to macrophage lecti 


444655 T08879 


AF088886 


Hs.11590 


cathepsin F 


107295 T34527 


AA186629


Hs.80120 


UDP-N-acetyl-alpha-D-galactosamine:polyp


107299 T40327 


BE277457 


Hs.30661 


hypothetical protein MGC46D5 


107315 T62771 


AA316241


Hs.90691 


nucleophosmin/nucleoplasmin 3 


107316 T63174 


T63174 


Hs.193700


Homo sapiens mRNA; cDNA DKFZp586I0324 (f


107328 T83444 


AW959891 


Hs.76591 


KIAA0887 protein 


107334 T93641 


T93597 


Hs.187429


ESTs 


456340 U48263 


U48263 


Hs.89040 


prepronociceptin 


128636 U49065 


U49065 


Hs.102865


interleukin 1 receptor-like 2 


129938 U79300 


AW003668 


Hs.135587 


Human clone 23629 mRNA sequence 


107375 U88573 


BE011845 


Hs.251064 


high-mobility group (nonhistone chromoso 


130074 U93867 


AL038596 


Hs.250745 


polymerase (RNA) III (DNA directed) (62k 


107387 W01094 


D86983 


Hs.118893


Melanoma associated gene 


132036 W01568 


AL157433


Hs.37706 


hypothetical protein DKFZp434E2220 


107426 W26853 


W26853 


Hs.291003 


hypothetical protein MGC4707 


135388 W27965 


W27965 


Hs.99865 


epimorphin 


130419 W36280 


AF037448 


Hs.155489 


NS1-associated protein 1


107469 W47063


W47063 


Hs.94668 


ESTs 


434203 W79060


BE262677 


Hs.283558 


hypothetical protein PRO1855


107506 W88550 


AB028981 


Hs.8021 


KIAA1058 protein 


132358 X60486 


NM_003542 Hs.46423


H4 histone family, member G 


107522 X78931 


X78931 


Hs.99971 


zinc finger protein 272 


456495 Z14077 


NM_003403 Hs.97496


YY1 transcription factor 


107582 AA002147 


AA002147


Hs.59952 


EST 


107609 AA004711 


R75654 


Hs.164797 


hypothetical protein FLJ13593 


107661 AA010383 


AA010383 


Hs.60389 


ESTs 


107714 AA015761 


AA015761 


Hs.60642 


ESTs 


107775 AA018772 


AW008846 


Hs.60857 


ESTs 


107832 AA021473 


AA021473 




gb:ze66d 1.s1 Soares retina N2b4HR Homo 


107859 AA024835 


AW732573 Hs.47584 


potassium voltage-gated channel, delayed 


107914 AA027229 


AA027229


Hs.61329 


ESTs, Weakly similar to T16370 hypotheti


107935 AA029428 


AA029428 


Hs.61555 


ESTs 


410196 AA035143 


AI936442 


Hs.59838 


hypothetical protein FLJ10808 


131461 AA035237 


AA992841 


Hs.27263 


KIAA1458 protein 


108007 AA039347 


AA039347 


Hs.61916 


EST 


108029 AA040740 


AA040740 


Hs.62007 


ESTs 


108040 AA041551


AL121031 


Hs.159971 


SWI/SNF related, matrix associated, acti 


108084 AA045513 


AA058944 


Hs.116602


Homo sapiens, clone IMAGE:4154008, mRNA, 


108088 AA045745 


AA045745 


Hs.62885 


ESTs 


108168 AA055348 


AI453137 


Hs.63176 


ESTs 


130719 AA056582 


AA679262 


Hs.14235 


hypothetical protein FLJ20008; KIAA1839 


108189 AA056697 


AW376061 


Hs.63335 


ESTs, Moderately similar to A46010 X-lin 


108190 AA056746 


AA056746 


Hs.63338 


EST 


108203 AA057678


AW847814 Hs.289005 


Homo sapiens cDNA; FLJ21532 fis, clone C 


108216 AA058681 


AA524743 


Hs.44883 


ESTs 


108217 AA058686 


AA058686 


Hs.62588 


ESTs 


108245 AA062840 


BE410285 


Hs.89545 


proteasome (prosome, macropain) subunit, 




AA064859 




gb:zm50f03.s1 Stratagene fibroblast (937 


108280 AA065069 


AA065069 




gb:zm12e11.s1 Stratagene pancreas (93720 


108309 AA069923


AA069818 




gb:zm67e03.r1 Stratagene neuroepithelium 


108340 AA070815


AA059820


Hs.180909 


peroxiredoxin 1 


108403 AA075374 


AA075374 




gb:zm87a01.s1 Stratagene ovarian cancer 


108427 AA076382


AA076382 




gb:zm91g08.s1 Stratagene ovarian cancer 


108435 AA078787 


T82427 


Hs.194101 


Homo sapiens cDNA: FLJ20869 fis, clone A 


108439 AA078986 


AA078986 




gb:zm92h01.s1 Stratagene ovarian cancer 
108465


AA079393 


AA079393 Hs.3462 


cytochrome c oxidase subunit VIIc


108469 


AA079487 


AA079487 


gb:zm97f08.s1 Stratagene colon HT29 (937 


108500 


AA083207 


AA083207 Hs.68270 


EST 


108501 


AA083256 


AA083255 


gb:zn08g12.s1 Stratagene hNT neuron (937 


108533 


AA084415 


AA084415 


gb:zn06g09.s1 Stratagene hNT neuron (937 
gb:zm26c06.s1 Stratagene pancreas (93720 


108562 


AA085274 


AA100795 


108589 


AA088678 


AI732404 Hs.68846 


ESTs 


130890 


AA100925


AI907537 Hs.76698 




432645 


AA101255 


D14041 Hs.347340 


H-2K binding factor-2 


130385 


AA126474


AW067800 Hs.155223


stanniocalcin 2 


108749 


AA127017 


AA127017 Hs.71052 


ESTs 


108807 


AA129968


AI652236 Hs.49376 


hypothetical protein FLJ20644 


108808 


AA130240


AA045088 Hs.62738 


ESTs 


108833 


AA131866 


AF188527 Hs.61661 


ESTs, Weakly similar to AF174605 1 F-box 


108846 


AA132983


AL117452 Hs.44155


DKFZP586G1517 protein


108857 


AA133250


AK001463 Hs.62180 


anillin (Drosophila Scraps homolog), act 


131474 


AA133583


L46353 Hs.2726 


high-mobility group (nonhistone chromoso


108894 


AA135941


AK001431 Hs.5105 


hypothetical protein FLJ10569 


108941 


AA148650 


AA148650 


gb:zo09e06.s1 Stratagene neuroepithelium


108968 


AA151110 


AI304870 Hs.188680 


ESTs 


108996 


AA155754


AW995610 Hs.332436 


EST 


109001 


AA156125 


AI056548 Hs.72116 


hypothetical protein FLJ20992 similar to 


131183 


AA156289


AI611807 Hs.285107 


hypothetical protein FLJ13397 


109019 


AA156997


AA156755 Hs.72150 


ESTs 


109022 


AA157291


AA157291 Hs.21479 


ubinuclein 1 


109023 


AA157293


AA157293 Hs.72168 


ESTs 


109068 


AA164293


AA164293 Hs.72545 


ESTs 


109072 


AA164676


AI732585 Hs.22394 


hypothetical protein FLJ10893


426981 


AA167375 


AL044675 Hs.173081 


KIAA0530 protein 


130346 


AA167550


H05769 Hs.188757 


Homo sapiens, clone MGC:5564, mRNA, com 


109146 


AA176589 


AA176589 Hs.142078 


EST 


109172 


AA180448


AA180448 Hs.144300 


EST 


428438 


AA187144 


NM_001955 Hs.2271


endothelin 1 


129208 


AA189170 


AI587376 Hs.109441 


MSTP033 protein 


109222 


AA192757 


AA192833 Hs.333512 


similar to rat myomegalin 


109300 


AA205650 


AA418276 Hs.170142 


ESTs 


109481 


AA233342 


AA878923 Hs.289069 


hypothetical protein FLJ21016 


109485 


AA233472 


BE619092 Hs.28465 


Homo sapiens cDNA: FLJ21869 fis, clone H 


109516 


AA234110 


AI471639 Hs.71913 


ESTs 


109537 


D80981 


AI858695 Hs.34898 


ESTs 


109556 


F01660 


AI925294 Hs.87385 


ESTs 


109577 


F02206 


F02206 Hs.296639 


Homo sapiens potassium channel subunit ( 


109578 


F02208 


F02208 Hs.27214 


ESTs 


109595 


F02544 


AA078629 Hs.27301 


ESTs 


109625 


F03918 


H29490 Hs.22697 


ESTs 


428376 


F04258 


AF119665 Hs.184011 


pyrophosphatase (inorganic) 


109648 


F04600 


H17800 Hs.7154 


ESTs 


109671 


F08998 


R59210 Hs.26634 


ESTs 


109699 


F09605 


H18013 Hs.167483 


ESTs 


109820 


F11115 


AW016809 Hs.119021


ESTs 


109933 


H06371 


R52417 Hs.20945 


Homo sapiens clone 24993 mRNA sequence 


110014 


H10995 


AL109666 Hs.7242 


Homo sapiens mRNA full length insert cDN 


110039 


H11938 


H11938 Hs.21907 


histone acetyltransferase


110099 


H16568 


R44557 Hs.23748 


ESTs 


110107 


H16772 


AW151660 Hs.31444 


ESTs 


110155 


H18951 


AI559626 Hs.93522 


Homo sapiens mRNA for KIAA1647 protein, 


110197 


H20859 


AW090386 Hs.112278




110223 


H23747 


H19836 Hs.31697


ESTs 


110306 


H38087 


H38087 Hs.105509 


CTL2 gene 


110335 


H40331 


H65490 Hs.18845 


ESTs 


110342 


H40567 


H40961 Hs.33008 


ESTs 


110395 


H46966 


AA025116 Hs.33333 


ESTs 


110511 


H56640 


H56640 Hs.221460 


ESTs 


110523 


H57154 


AI040384 Hs.19102 


ESTs, Weakly similar to organic anion tr 


110715 


H96712 


H96712 Hs.269029 


ESTs 


110754 


N20814 


AW302200 Hs.6336 


KIAA0672 gene product 


428454 


N25249 


U55936 Hs.184376 


synaptosomal-associated protein, 23kD 


431663 


N27100 


NM_016569 Hs.267182


TBX3-iso protein 




N39616 


AW973443 Hs.8086 


RNA (guanine-7-) methyltransferase


110938 


N48982 


N48982 Hs!38034 


Homo sapiens cDNA FLJ12924 fis, clone NT 


110983 


N51957 


NM_015367 Hs.10267


MIL1 protein 


111081 


N59435 


AI146349 Hs.271614 


CGI-112 protein 


111128 


N64139 


AW505364 Hs.19074 


LATS (large tumor suppressor, Drosophila 


431548 


N66981 


AI834273 Hs.9711 


novel protein 


111216 


N68640 


AW139408 Hs.152940


ESTs 


437562 


N69352 


AB001638 Hs.5683 


DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 
111399


R00138 


AW270776 Hs.18857


ESTs 


111514 


R07998 


R07998 




gb:yf16g11.s1 Soares fetal liver spleen


428744 


R08929 


BE267033 


Hs.192853 


ubiquitin-conjugating enzyme E2G 2 (homo 


111574 


R10307 


AI024145 


Hs.188526 


ESTs 


111804 


R33354 


AA482478 


Hs.181785 


ESTs 


111831 


R36083 


R36095 


Hs.268695 


ESTs 


426773 


R37938 


NM_015556 Hs.172180


KIAA0440 protein 


111904 


R39330 


Z41572 




gb:HSCZYB122 normalized infant brain cDN 


428371 


R40816 


AB012193 


Hs.183874 


cullin 4A 


112033 


R43162 


R49031 


Hs.22627 


ESTs 


130987 


R45698 


BE613269 


Hs.21893 


hypothetical protein DKFZp761N0624 


112300 


R54554 


H24334 


Hs.26125 


ESTs 


112513 


R68425 


R68425 


Hs.13809 


hypothetical protein FLJ10648 


112514 
112522 


R68568 
R68763 


R68568 
R68857 


Hs.183373
Hs.265499 


src homology 3 domain-containing protein 
ESTs 


112540 


R70467 


R69751 




gb:yi40a10.s1 Soares placenta Nb2HP Homo 


428655 


R73565 


H05769 


Hs.188757 


Homo sapiens, clone MGC:5564, mRNA, comp 


129534 
112597 


R73640 
R78376 


AK002126 
R78376 


Hs.11260 
Hs.29733 


hypothetical protein FLJ11264
EST 


112732 


R92453 


R92453 


Hs.34590 


ESTs 


451798 


T03865 


BE297567 


Hs.27047 


hypothetical protein FLJ20392 


112888 


T03872 


AW195317 


Hs.107716 


hypothetical protein FLJ22344 


131863 


T10072 


AI656378 


Hs.33461 




112911 


T10080 


AW732747 Hs.13493 


like mouse brain protein E46 


132215 


T10132 


AL035703 


Hs.4236 


KIAA0478 gene product 


112931 


T15343 


T02966 


Hs.167428 


ESTs 


112984 


T23457 


T16971 


Hs.289014 


ESTs, Weakly similar to A43932 mucin 2 p 


112998 


T23555 


H11257 


Hs.22968 


Homo sapiens clone IMAGE:451939, mRNA se 


133376 


T23670 


BE618768 


Hs.7232 


acetyl-Coenzyme A carboxylase alpha 


113026 


T23948 


AA376654 




eukaryotic translation initiation factor


113070 


T33464 


AB032977 


Hs.6298 


KIAA1151 protein 


410781 


T34413 


AI375672 


Hs.165028 


ESTs 


113074 
113095 


T34611 
T40920 


AK001335 
AA828380 


Hs.31137 
Hs.126733 


protein tyrosine phosphatase, receptor t 
ESTs 


113179 


T55182 


BE622021 


Hs.152571 


ESTs, Highly similar to IGF-II mRNA-bind 


113337 


T77453 


T77453 


Hs.302234 


ESTs 


113421 


T84039 


AI769400 


Hs.189729 


ESTs 


113454 


T86458 


AI022166 


Hs.16188 


ESTs 


113481 

453345 


T87693 
T89350 


T87693 
AA302862 


Hs.204327 
Hs.90063 


EST 

neurocalcin delta 


113557 
113559 


T90945 
T90987 


H66470 
T79763 


Hs.16004 
Hs.14514 


ESTs 


113589 


T91863 


AI078554 


Hs.15682 


ESTs 


113591 


T91881 


T91881 


Hs.200597 


KIAA0563 gene product 


113619 
113683 


T93783 
T96687 


R08665 
AB035335 


Hs.17244
Hs.144519 


hypothetical protein FLJ13605
T-cell leukemia/lymphoma 5


113692 


T96944 


AL360143 


Hs.17936


DKFZP434H132 protein 


113702 


T97307 


T97307 




gb:ye53h05.s1 Soares fetal liver spleen 


113717 


T97764 


T99513 


Hs.187447 


ESTs 


113824 


W48817 


AI631964 


Hs.34447 


ESTs 


113840 


W58343 


R72137 


Hs.7949 


DKFZP586B2420 protein 


113844 


W59949 


AI369275 


Hs.243010 


Homo sapiens cDNA FLJ14445 fis, clone HE 


113902 


W74644 


AA340111 


Hs.100009 


acyl-Coenzyme A oxidase 1, palmitoyl 


113904 


W74761 


AF125044 


Hs.19196 


ubiquitin-conjugating enzyme HBUCE1 


113905 


W74802 


R81733 


Hs.33106 


ESTs 


113931 


W81205 


BE255499 


Hs.3496 


hypothetical protein MGC15749 


113932 


W81237 


AA256444 


Hs.126485


hypothetical protein FLJ12604; KIAA1592


131965 


W90146 


W79283 


Hs.35962 


ESTs 


114035 


W92798 


W92798 


Hs.269181 


ESTs 


114106 


Z38412 


AW602528 




gb:RC5-BT0562-260100-011-A02 BT0562 Homo


457308 


Z38709 


AI416988 


Hs.238272 


inositol 1,4,5-triphosphate receptor, ty


114161 


Z38904 


BE548222 


Hs.299883 


hypothetical protein FLJ23399 


424949 


Z39103 


AF052212 


Hs.153934


core-binding factor, runt domain, alpha 


457548 


Z39930 


AW069534 Hs.279583 


CGI-81 protein 


128937 


Z39939 


AA251380 


Hs.10726 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


432554 


Z40012 


AI479813 


Hs.278411 


NCK-associated protein 1 


114277 


Z40377 


AI052229 


Hs.25373 


ESTs, Weakly similar to T20410 hypotheti


114304 


Z40820 


AI934204 


Hs.16129 


ESTs 






AL117427 


Hs.172778


Homo sapiens mRNA; cDNA DKFZp566P013 (fr 


432620 


AA005112 


AA777749 


Hs.5978 


LIM domain only 7 


129034 


AA005432 


AA481157 


Hs.108110 


DKFZP547E2110 protein


131881 


AA010163 


AW361018 Hs.3383 


upstream regulatory element binding prot 


332421 


AA026356 


AI909968 


Hs.108106 


transcription factor 


114465 


AA026901 


BE621056 


Hs.131731 


hypothetical protein FLJ11099


451271 


AA036867 


AK001644 


Hs.26156 


hypothetical protein FLJ10782


332498 


AA044644 


AA303661 




lymphocyte-specific protein 1 
431555


AA046426 


AI815470 Hs.260024 


Cdc42 effector protein 3 


132944 


AA054515 


T96641 Hs.6127 


Homo sapiens cDNA: FLJ23020 fis, clone L 


114618 


AA084162 


AW979261 Hs.291993 


ESTs 


332509 


AA085749 


AA128376 Hs.153884 


ATP binding protein associated with cell 


114648 


AA101056 


AA101056 


gb:zn25b03.s1 Stratagene neuroepithelium


114658 


AA102746 


AA102383 Hs.249190 


tumor necrosis factor receptor superfami 


132456 


AA114250


AB011084 Hs.48924 


KIAA0512 gene product; ALEX2 


450847 


AA126561


NM_003155 Hs.25590


stanniocalcin 1 


132225 


AA128980


AA128980 


gb:zo09a11.s1 Stratagene neuroepithelium 


437197 


AA129757


W38586 


guanine nucleotide binding protein (G pr 


114709 


AA129921 


AA397651 Hs.301959 


proline synthetase co-transcribed (bacte


456926 


AA133331 


AB018284 Hs.158688 


KIAA0741 gene product 


114750 


AA135958


AA887211 Hs.129467 


ESTs 


426806 


AA136524 


T19228 Hs.172572 


hypothetical protein FLJ20093 


114763 


AA147044 


AA810755 Hs.102500 


hypothetical protein dJ511E16.2 


114767 


AA148885 


AI859865 Hs.154443 


minichromosome maintenance deficient (S.


114774 


AA150043


AV656017 Hs.184325 


CGI-76 protein 


129388 


AA151621 


AA662477 Hs.110964 


hypothetical protein FLJ23471 


457742 


AA155743


BE561824 Hs.273369 


uncharacterized hematopoietic stem/proge 


456200 


AA156335 


AA768242 Hs.80618 


hypothetical protein 


130207 


AA156336


AF044209 Hs.144904 


nuclear receptor co-repressor 1 


114798 


AA159181 


AA159181 Hs.54900 


serologically defined colon cancer antig 


114800 


AA159825 


Z19448 Hs.131887 


ESTs, Weakly similar to T24396 hypotheti 


114828 


AA234185 


AA252937 Hs.283522 


Homo sapiens mRNA; cDNA DKFZp434J1912 (f 


114846 


AA234929 


BE018682 Hs.166196 


ATPase, Class I, type 8B, member 1 


114848 


AA234935 


BE614347 Hs.169615 


hypothetical protein FLJ20989


114902 


AA236359 


AW275480 Hs.39504 


hypothetical protein MGC4308 


132271 


AA236466 


AB030034 Hs.115175


sterile-alpha motif and leucine zipper c 


114907 


AA236535 


N29390 Hs.13804


hypothetical protein dJ462023.2 


420170 


AA236935 


U43374 Hs.95631 


Human normal keratinocyte mRNA 


132204 


AA236942 


AA235827 Hs.42265 


ESTs 


114928 


AA237018 


AA237018 Hs.94869 


ESTs 


132481 


AA237025 


W93378 Hs.49614 


ESTs 


114932 


AA242751 


AA971436 Hs.16218 


KIAA0903 protein 


314162 


AA242760 


BE041820 Hs.38516 


Homo sapiens, clone MGC:15887, mRNA, com 


131006 


AA242763 


AF064104 Hs.22116 


CDC14 (cell division cycle 14, S. cerevi 


114935 


AA242809 


H23329 Hs.290880 


ESTs, Weakly similar to ALU1_HUMAN ALU S


408908 


AA243133 


BE296227 Hs.250822 


serine/threonine kinase 15 


437754 


AA243495 


R60366 Hs.5822 


Homo sapiens cDNA: FLJ22120 fis, clone H 


114957 


AA2437G6 


AW170425 Hs.87680


ESTs 


114974 


AA250848 


AW966931 Hs.302649 


nucleosome assembly protein 1-like 1 


114977 


AA250868 


AW296978 Hs.87787 


ESTs 


114995 


AA251152 


AA769266 Hs.193657


ESTs 


115005 


AA251544 


AI760825 Hs.153042 


ESTs 


417177 


AA251792 


NM_004458 Hs.81452


fatty-acid-Coenzyme A ligase, long-chain 


115026 


AA252144 


AA251972 Hs.188718 


ESTs 


115045 


AA252524 


AW014549 Hs.58373 


ESTs 


115068 


AA253461 


AW512260 Hs.87767 


ESTs 


133138 


AA255522 


AV657594 Hs.181161 


Homo sapiens cDNA FLJ14643 fis, clone NT


332668 


AA255522 


AV657594 Hs.181161 


ESTs 


115114 


AA256468 


AA527548 Hs.7527 


small fragment nuclease 


129584 


AA256528 


AV656017 Hs.184325 


CGI-76 protein 


115137 


AA257976 


AW968304 Hs.56156 


ESTs 


417187 


AA258296 


AB011151 Hs.334659 


hypothetical protein MGC14139 


115166 


AA258409 


AF095727 Hs.287832 


myelin protein zero-like 1 


115167 


AA258421 


AA749209 Hs.43728 


hypothetical protein 


436719 


AA262077 


Y11192 Hs.5299 


aldehyde dehydrogenase 5 family, member 


115239 


AA278650 


BE251328 Hs.73291 


hypothetical protein FLJ10881 


115243 


AA278766 


AA806600 Hs.116665


KIAA1842 protein


428419 


AA280791 


U4943S 


KIAA1856 protein 


115322 


AA280819 


L08895 Hs.78995 


MADS box transcription enhancer factor 2 


413303 


AA280828 


AW836130 Hs.75277 


hypothetical protein FLJ13910 


115372 


AA282195 


AW014385 Hs.88678 


ESTs, Weakly similar to Unknown [H.sapie 


409962 


AA283127 


U82671 Hs.57698 


Target CAT 


130269 


AA284694 


F05422 Hs.168352 


nucleoporin-like protein 1 


456570 


AA291137 


AA286914 Hs.183299 


ESTs 


332675 


AA291708 


BE439944 


ESTs 


407864 


AA293495 


AF069291 Hs.40539 


chromosome 8 open reading frame 1 




AA347193 


AK001463 Hs.62180


anillin (Drosophila Scraps homolog), act 


408799 


AA398474 


AA059412 Hs.47986 


hypothetical protein MGC10940 


115575 


AA398512 


AA393254 Hs.43619 


ESTs 


115601 


AA400277 


AA148984 Hs.48849 


ESTs, Weakly similar to ALU4_HUMAN ALU S 


434428 


AA400896 


D14540 Hs.199160 


myeloid/lymphoid or mixed-lineage leukem 


115683 


AA410345 


AF255910 Hs.54650 


junctional adhesion molecule 2 


115715 


AA416733 


BE395161 Hs.1390 


proteasome (prosome, macropain) subunit, 


132952 


AA425154 


AI658580 Hs.61426 


Homo sapiens mesenchymal stem cell prote 
115819


AA426573 


AA486620 


Hs.41135 


endomucin-2 


409124 


AA431418 


AW292309 


Hs.50727 


N-acetylglucosaminidese, alpha- (Sanfili 


115895 


AA436182 


AB033035 


Hs.51965 


KIAA1209 protein 


458073 


AA437099 


AA192669


Hs.45032 


ESTs 


115962 


AA446585 


AI636361 


Hs.179520 


hypothetical protein MGC10702 


115967 


AA446887 


AI745379 


Hs.42911 


ESTs 


115974 


AA447224 


BE513442 


Hs.238944 


hypothetical protein FLJ10631 


115985 


AA447709 


AA447709 


Hs.268115 


ESTs, Weakly similar to T0B599 probable 


129254 


AA453624 


AA252468 


Hs.1098 


DKFZp434J1813 protein 


446730 


AA455044 


BE384932 


Hs.64313 


ESTs, Weakly similar to AF257182 1 G-pro 


116095 


AA456045 


AA043429 


Hs.62618 


ESTs 


426856


AA460454 


R19768 


Hs.172788 


ALEX3 protein 


116210 


AA476494 


BE622792 


Hs.172788 


ALEX3 protein 


116213 


AA476738


AA292105 


Hs.326740 


hypothetical protein MGC10947 


432645 


AA481422 


D14041 


Hs.347340 


H-2K binding factor-2 


116265 


AA482595 


BE297412 


Hs.55189 


hypothetical protein 


129334 


AA485084 


AW157022 Hs.343551


hypothetical protein FLJ22584 


115274 


AA485431 


AH 29767 


Hs.182874 


guanine nucleotide binding protein (G pr 


426002 


AA489638 


BE514376 Hs.165998 


PAI-1 mRNA-binding protein 


115331 


AA491000 


N41300 


Hs.71616 


Homo sapiens mRNA; cDNA DKFZp585M1720 (f 


116333 


AA491250 


AF155827


Hs.203963 


hypothetical protein FLJ10339 


132994 


AA505133 


AA112748


Hs.279905 


clone HQ0310PRO0310p1 


418538 


AA598447 


BE244323 


Hs.85951 


exportin, tRNA (nuclear export receptor 


116391 


AA599243 


T86558 


Hs.75113 


general transcription factor IIIA 


116394 


AA599574 


NM_006033 Hs.65370


lipase, endothelial 


134531 


AA600153 


AI742845 


Hs.110713


DEK oncogene (DNA binding)


116417 


AA609309 


AW499664 




Human clone 23826 mRNA sequence 


116429 


AA609710 


AF191018 


Hs.279923 


putative nucleotide binding protein, est 


116439 


AA610068 


AA251594 


Hs.43913 


PIBF1 gene product 


116459 


AA621399 


R80137 


Hs.302738 


Homo sapiens cDNA: FLJ21425 fis, clone C


427505 


AA621752 


AA361662 


Hs.178761 


26S proteasome-associated pad1 homolog 


409633 


C21523 


AW449822 Hs.55200 


ESTs 


116541 


D12160 


D12160 


Hs.249212 


polymerase (RNA) III (DNA directed) (155 


132557 


D19708 


AA114926


Hs.169531 


ESTs 


414964 


D25801 


AA337548 


Hs.333402 


hypothetical protein MGC12760 


116571 


D45652 


D45652 


Hs.211604 


gb:HUMGS02848 Human adult lung 3' direct 


451522 


D60208 


BE565817 


Hs.26498 


hypothetical protein FLJ21657 


421919 


D80504 


AJ224901 


Hs.109526 


zinc finger protein 198 


116643 


F03010 


AI367044


Hs.153638 


myeloid/lymphoid or mixed-lineage leukem 


116661 


F04247 


R61504 




gb:yh16a03.s1 Soares infant brain 1NIB H


116715 


F10966 


AL117440


Hs.170263 


tumor protein p53-binding protein, 1 


116729 
318709 


F13700 
H05063 


BE549407 
R52576 


Hs.115823
Hs.285280 


ribonuclease P, 40kD subunit 

Homo sapiens cDNA: FLJ22096 fis, clone H 


418999 


H16758 


NM_000121 Hs.89548


erythropoietin receptor 


116773 


H17315 


AI823410 


Hs.343581 


karyopherin alpha 1 (importin alpha 5)


116780 


H22566 


H22566 


Hs.63931 


ESTs 


453884 
116819 


H48459 
H53073 


AA355925 
H53073 


Hs.36232 
Hs.93698 


KIAA0186 gene product 
EST 


427278 


H56559 


AL031428 


Hs.174174 


KIAA0601 protein 


407833 


H57957 


AW955632 Hs.66666 


ESTs, Weakly similar to S19560 proline-r 


116844 


H64938 


H64938 


Hs.337434 


ESTs, Weakly similar to A46010 X-linked


116845 


H64973 


AA649530 


Hs.348148 


gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens 


116892 


H69535 


AI573283


Hs.38458 


ESTs 


116925 


H73110 


H73110 


Hs.260603 


ESTs, Moderately similar to A47582 B-cel 


116981 


H81783 


N29218 


Hs.40290 


ESTs 


453133 


H85259 


AC005757 


Hs.31809 


hypothetical protein 


117031 


H88353 


H88353 


Hs.347265 


gb:yw21a02.s1 Morton Fetal Cochlea Homo 


117034 


H83639 


U72209 




YY1-associated factor 2 


431129 


H83675 


AL137751 


Hs.263671 


Homo sapiens mRNA; cDNA DKFZp434I0812 (f


417861 


H93708 


AA334551 




sperm specific antigen 2 


117280 


N22107 


M18217 


Hs.172129 


Homo sapiens cDNA: FLJ21409 fis, clone C 


117344 


N24046 


R19085 


Hs.210706 


Homo sapiens cDNA FLJ13182 fis, clone NT 


117422 


N27028


AI355562 


Hs.43880 


ESTs, Weakly similar to A46010 X-linked


117475 


N30205 


N30205 


Hs.93740 


ESTs, Weakly similar to I38022 hypotheti 


117487 


N30621 


N30621 


Hs.44203 


ESTs 


117937 


N33258 


AF044209 


Hs.144904 


nuclear receptor co-repressor 1 


130207 


N33258 


AF044209 


Hs.144904 


nuclear receptor co-repressor 1 


117549 


N33390 


N33390 


Hs.44483 


EST 




N40180 


N40180 






117710 


N45198 


N45198 


Hs.47248 


ESTs, Highly similar to similar to Cdc14


117791 


N48325 


N48325 


Hs.93956 


EST 


117822 


N48913 


AA706282 


Hs.93963 


ESTs 


422544 


N49394 


AB018259 


Hs.118140 


KIAA0716 gene product 


117895 


N50656 


AW450348 


Hs.93996 


ESTs, Highly similar to SORL_HUMAN SORTI 


452259 


N50721 


AA317439 


Hs.28707 


signal sequence receptor, gamma (translo 


133057 


M53143 


AA465131 


Hs.64001 


Homo sapiens clone 25218 mRNA sequence
118103


N55326 


AA401733 


Hs.184134 


ESTs 


118111 


N55493 


N55493 




gb:yv50c02.s1 Soares fetal liver spleen 


118129 


N57493 


N57493 




gb:yy54c08.s1 Soares_multiple_sclerosis_


118278 
118329 


N62955 
N63520 


N62955 
N63520 


Hs.316433 


Homo sapiens cDNA FLJ11375 fis, clone HE 
gb:yy62f01.s1 Soares_multiple_sclerosis_


118335 


N63604 


BE327311 


Hs.47166 


HT021 


417098 


N64166 


AB017355 


Hs.173859 


frizzled (Drosophila) homolog 7 


118363 


N64168 


AI183838 


Hs.48938 


hypothetical protein FLJ21802 


118364 


N64191 


N46114 


Hs.29169 


hypothetical protein FLJ22623 


118475 


N66845 


N66845 




gb:za46c11.s1 Soares fetal liver spleen 


118491 


N67135 


AV647908 


Hs.90424 


Homo sapiens cDNA: FLJ23285 fis, clone H


118500 
118584 


N67295 
N68963 


W32889 
AW136928


Hs.154329


ESTs 

gb:UI-H-BI1-adp-d-08-0-UI.s1 NCI_CGAP_Su


456647 


N69331 


AI252640 


Hs.110364


peptidylprolyl isomerase C (cyclophilin


118661 


N70777 


AL137554


Hs.49927 


protein kinase NYD-SP15 


118684 


N71364 


N71313 


Hs.163986 


Homo sapiens cDNA: FLJ22765 fis, clone K


118689 


N71545 


AW390601 Hs.184544 


Homo sapiens, clone IMAGE:3355383, mRNA,
ESTs 


118690 


N71571 


N71571 


Hs.269142 


118765 


N74456 


N74456 


Hs.50499 


EST 


118793 


N75594 


N75594 


Hs.285921 


ESTs, Moderately similar to T47135 hypot 


118817 


N79035 


AI668658 


Hs.50797 


ESTs 


118844 


N80279 


AL035364 


Hs.50891 


hypothetical protein 


118919 


N91797 


AW452696 


Hs.130760


myosin phosphatase, target subunit 2 


129558 


N92454 


AW580922 Hs.180446


karyopherin (importin) beta 1 


407604 


N94581 


AW191962 


Hs.288061 


collagen, type VIII, alpha 2 


118996 


N94746 


N94746 


Hs.274248 


hypothetical protein FLJ20758 


119021 


N98238 


N98238 


Hs.55185 


ESTs 


119039 


R02384 


AI160570 


Hs.252097 


pregnancy specific beta-1-glycoprotein 6 


119063 


R16833 


R16833 


Hs.53106 


ESTs, Moderately similar to ALU1_HUMAN A 


332622 


R41828 


R10674 




CSR1 protein 


119111 


R43203 


T02865 


Hs.328321 


EST 


415115 


R46395 


AA214228 


Hs.127751 


hypothetical protein 


119146 


R58863 


R58863 


Hs.91815 


ESTs 


449224 


R78248 


AW995911 


Hs.299883 


hypothetical protein FLJ23399 




T11483 


T11483 




gb:CHR90049 Chromosome 9 exon Homo sapie 


119281 


T16896 


AI692322 


Hs.65373 


ESTs, Weakly similar to T02345 hypotheti 


119298 


T23820 


NM_001241 Hs.155478


cyclin T2 


126502 


T30222 


T10077 


Hs.13453


hypothetical protein FLJ14753 


419983 


W15275 


W55956 


Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f


119558 


W38194 


W38194 




Empirically selected from AFFX single pr 


429641 


W42414 


AW081883 Hs.211578 


Homo sapiens cDNA: FLJ23037 fis, clone L 


419445 


W49632 


AA884471 


Hs.90449 


Human clone 23903 mRNA sequence 


119650 


W57613 


R82342 


Hs.79856 


ESTs, Weakly similar to S65657 alpha-1C- 


119654 


W57759 


W57759 




gb:zd20g11.s1 Soares_fetal_heart_NbHH19W 
ESTs 


119683 


W61118 


W65379 


Hs.57835 


119694 


W65344 


AA041350 Hs.57847 


ESTs, Moderately similar to ICE4_HUMAN C 


119718 


W69216 


W69216 


Hs.92848 


ESTs 


410365 


W59379 


AI287518 




Homo sapiens mRNA; cDNA DKFZp586D0923 (f 


119938 


W86728 


AW014862 


Hs.58885 


ESTs 


120128 


Z38499 


BE379320 


Hs.91448 


MKP-1 like protein tyrosine phosphatase 


120130 


Z38630 


AA045767 


Hs.5300 


bladder cancer associated protein 


120148 


Z39494 


F02806 


Hs.65765 


ESTs 


120155 


Z39623 


Z39623 


Hs.65783 


ESTs 


451979 


Z40071 


F06972 


Hs.27372 


BMX non-receptor tyrosine kinase 


120183 


Z40174 


AW082866 Hs.65882 


ESTs 


120184 


Z40182 


Z40182 


Hs.65885 


EST 


120211 


Z40904 


Z40904 


Hs.66012 


EST 


120245 


AA166965


AW959615 Hs.111045


ESTs 


120247 


AA167500


AA167500 


Hs.103939 


EST 


120254 


AA169599


W90403 


Hs.111054


ESTs 


120259 


AA171724 


AW014786 


Hs.192742


hypothetical protein FLJ12785 


120260 


AA171739 


AK000061 


Hs.101590 


hypothetical protein 


120275 


AA177105


AA177105 


Hs.78457 


solute carrier family 25 (mitochondrial


120284 


AA182626


AA179656 




gb:zp54e11.s1 Stratagene NT2 neuronal pr


417735 


AA186324


AA188175 


Hs.82506 


KIAA1254 protein 


422137 
120302 
120303 


AA192099
AA192173
AA192415


AJ236885 
AA837098 
AI216292 


Hs.269933 
Hs.96184 


zinc finger protein 148 (pHZ-52) 

ESTs 

ESTs 




AA192553


AW295096 Hs.101337 


uncoupling protein 3 (mitochondrial, pro 


120319 


AA194851


T57776 


Hs.191094 


ESTs 


408729 


AA195520


AA195764 


Hs.72639 


ESTs 


120326 


AA196300


AA196300


Hs.21145 


hypothetical protein RG083M05.2 


133145 


AA196549


H94227 


Hs.6592 


Homo sapiens, clone IMAGE:2961368, mRNA,


120327 


AA196721


AK000292 


Hs.130732 


hypothetical protein FLJ20285 


120328 
120340 


AA196979
AA206828 


AA923278 
AA206828 


Hs.290905 


ESTs, Weakly similar to protease [H.sapi 
gb:zq80b08.s1 Stratagene hMT neuron (937 






417122 


AA207123 


AI906291 


Hs.81234 


immunoglobulin superfamily, member 3 


131522 


AA214539 


AI380040 


Hs.239489 


TIA1 cytotoxic granule-associated RNA-bi 


421787 


AA226914 


AA227068 


Hs.108301


nuclear receptor subfamily 2, group C, m 


120375 


AA227260 


AF028706 


Hs.111227 


Zic family member 3 (odd-paired Drosophi


120376 


AA227469 


AA227469 




gb:zr18a07.s1 Stratagene NT2 neuronal pr


120390 


AA233122 


AA837093


Hs.111460


calcium/calmodulin-dependent protein kin


410804 


AA233334 


U64820 


Hs.65521 


Machado-Joseph disease (spinocerebellar 


434223 


AA233347 


AI825842 


Hs.3776 


zinc finger protein 216 


312771 


AA233714 


AA018515 


Hs.264482 


Homo sapiens mRNA; cDNA DKFZp761A0411 (f 


120396 


AA233796 


AA134006


Hs.79306 


eukaryotic translation initiation factor 


120409 


AA235050 


AA235050 




gb:zs38e04.s1 Soares_NhHMPu_S1 Homosapi 


120414 


AA235704 


AW137156 


Hs.181202


hypothetical protein FLJ10038 


120420 


AA236031 


AI128114 


Hs.112885


spinal cord-derived growth factor-B 


120422 


AA236352 


AL133097 


Hs.301717 


hypothetical protein DKFZp434N1928 


419326 


AA236390 


W94915 


Hs.42419 


ESTs 


120423 


AA236453 


AA236453 


Hs.18978


Homo sapiens cDNA: FLJ22822 fis, clone K 


120435 


AA243370 


AA243370 


Hs.96450 


EST 


120453 


AA250947 


AA250947 


Hs.170263 


tumor protein p53-binding protein, 1 
ESTs, Weakly similar to ALUC_HUMAN !!!!


120455 


AA251083


AA251720 Hs.104347 


120456 


AA251113 


AA488750 


Hs.88414 


BTB and CNC homology 1, basic leucine zi


120473 


AA251973 


AA251973 


Hs.269988 


ESTs 


128922 


AA252023 


AI244901 


Hs.9589 


ubiquilin 1 


120477 


AA252414 


AA252414 


Hs.43141 


DKFZP727C091 protein 


120479 


AA252650 


AF006689 


Hs.110299


mitogen-activated protein kinase kinase 


120488 


AA255523 


AW952916 


Hs.63510 


KIAA0141 gene product 


120510 


AA258128 


AI796395 


Hs.111377


ESTs 


120527 


AA262105 


AA262105 


Hs.4094 


Homo sapiens cDNA FLJ14208 fis, clone NT


120528 


AA262107 


AI923511 


Hs.104413


ESTs 


120529 


AA262235 


AI434823 


Hs.104415 


ESTs 


120541 


AA278298 


W07318 


Hs.240 


M-phase phosphoprotein 1 


120544 


AA278721 


BE548277 


Hs.103104 


ESTs 


120562 


AA280036 


BE244580 


Hs.342307 


hypothetical protein FLJ10330


120569 


AA280648 


AA807544


Hs.24970 


ESTs, Weakly similar to B34323 GTP-bindi 


120571 


AA280738


AB037744 


Hs.34892 


KIAA1323 protein 


120572 


AA280794 




Hs.294008 


ESTs 


129434 


AA280837 


AW957495 Hs.186644


ESTs 


130529 


AA280886 


AA178953


Hs.309648 


gb:zp39e03.s1 Stratagene muscle 937209 H 


120575 


AA280934 


AW978022 


Hs.238911 


hypothetical protein DKFZp762E1511; KIAA 


409339 


AA281535


AB020686 


Hs.54037 


ectonucleotidepyrophosphatase/phosphodi 


120591 


AA281797


AF078847 


Hs.191356 


general transcription factor IIH, polype


120593 


AA282047 


AA748355 


Hs.193522


ESTs 


430275 


AA283002 


Z11773 


Hs.237786 


zinc finger protein 187 


440303 


AA283709 


AA306166 Hs.7145 


calpain 7


120609 


AA283902 


AW978721 Hs.266076 


ESTs, Weakly similar to A46010 X-linked 


409702 


AA284108 


AI752244 




eukaryotic translation elongation factor 


456870 


AA284109 


AI241084 


Hs.154353 


nonselective sodium potassium/proton exc 


132614 


AA284371 


AA284371


Hs.118064


similar to rat nuclear ubiquitous casein 


458750 


AA284744


AA115496


Hs.336898 


Homo sapiens, Similar to RIKEN cDNA 1810


135376 


AA284784 


BE617856 


Hs.99756 


mitochondrial ribosome recycling factor 


120621 


AA284840 


AW961294 Hs.143818 


hypothetical protein FLJ23459 


452279 


AA286844 


AA286844 


Hs.61260 


hypothetical protein FLJ13164


332484 


AA287032 


AW172431 Hs.13012 


ESTs 


120644 


AA287038 


AI869129 


Hs.96616 


ESTs 


120660 


AA287546 


AA286785 


Hs.99677 


ESTs 


135370 


AA287553


BE622187 


Hs.99670 


ESTs, Weakly similar to I38022 hypotheti 


120661 


AA287556 


AA287556 


Hs.263412 


ESTs, Weakly similar to ALUB_HUMAN !!!!


429828 


AA287564 


AB019494 


Hs.225767 


IDN3 protein 


452291 


AA291015 


AF015592 


Hs.28853 


CDC7 (cell division cycle 7, S. cerevisi 


120699 


AA291716 


AI683243 


Hs.97258 


ESTs, Moderately similar to S29539 ribos 


100690 


AA291749 


AA383256 


Hs.1657


estrogen receptor 1 


120726 


AA293656 


AA293655 


Hs.21198 


ESTs 


120737 


AA302430 


AL049176 


Hs.82223 


chordin-like 


120745 


AA302809


AA302809 




gb:EST10426 Adipose tissue, white I Homo 


443574 


AA302820 


U83993 


Hs.321709 


purinergic receptor P2X, ligand-gated io 


120750 


AA310499 


AI191410 


Hs.96693 


ESTs, Moderately similar to 2109260A B c


120761 


AA321890 


AA321890 




branched chain keto acid dehydrogenase E 


120768 


AA340589 


AA340589 


Hs.104560


EST 


120769 


AA340622


AI769467 


Hs.9475 


ESTs 




AA342457 


AL038812 


Hs.96800 


ESTs, Moderately similar to ALU7_HUMAN A
ESTs


120793 


AA342864 


AA342864 Hs.96812 


120796 


AA342973 


AI247356 


Hs.96820 


ESTs 


120809 


AA346495 


AA346495 




gb:EST52657 Fetal heart II Homo sapiens 




AA347573 


AL120071 


Hs.48998 


fibronectin leucine rich transmembrane p


120825 


AA347614 


AI280215 


Hs.96885 


ESTs 


120827 


AA347717 


AA382525 


Hs.132967 


Human EST clone 122887 mariner transposo 


120839 


AA348913 


AA348913 




gb:EST55442 Infant adrenal gland II Homo 
120850 AA349647


AA349647 


Hs.96927 


Homo sapiens cDNA FLJ12573 fis, clone NT


120852 AA349773 


AA349773 


Hs.191564 


ESTs 


128852 AA350541 


R40622 


Hs.106601 


ESTs 


135240 AA357159 


AA357159 


Hs.96985 




120870 AA357172 


AA357172 


Hs.292581 


ESTs, Moderately similar to ALU1_HUMAN A


120894 AA370132 


AA370132 


Hs.97063 


ESTs 


435737 AA370472 


AF229839 


Hs.173202 


I-kappa-B-interacting Ras-like protein 1


120897 AA370867 


AA370867 


Hs.97079 


ESTs, Moderately similar to AF174605 1 F 


120915 AA377295


AL135556


Hs.97104


ESTs 


120935 AA383902 


AL048409 


Hs.97177 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


120936 AA385934 


AA385934 


Hs.97184 


EST, Highly similar to (defline not avai


120937 AA386255 


AA386255 


Hs.97186 


EST 


120938 AA386260 


AA386260 


Hs.104632 


EST 


417632 AA386266 


R20855 


Hs.5422 


glycoprotein M6B 


120960 AA398014 


AA398014 


Hs.104684 


EST 


120985 AA398222 


AI219896 


Hs.97592 


ESTs 


120988 AA398235 


AA398235 


Hs.97631 


ESTs 


121008 AA398348 


AA398348 


Hs.130546


Human DNA sequence from clone RP11-251J8


121029 AA398482 


AA398482 


Hs.97641 


EST 


121032 AA398504 


AA393037 


Hs.161798 


ESTs 


121033 AA398505 


AA398505 


Hs.97360 


ESTs 


121034 AA398507 


AL389951 


Hs.271623 


nucleoporin 50kD 


121035 AA398523 


AA398523 


Hs.210579 


ESTs 


121058 AA398625 


AA398625 


Hs.97391 


ESTs 


121060 AA398632 


AA398632 


Hs.97395 


ESTs 


121061 AA398633 


AA393288 


Hs.97396 


ESTs 


121091 AA398894 

121092 AA398895 


AA398894 
AA398895 


Hs.97657 
Hs.97658 


ESTs, Moderately similar to ALU8_HUMAN A 
EST 


121094 AA398900 


AA402505 




gb:zt62h10.r1 Soares_testis_NHT Homo sap


121096 AA398904 


AA398904 


Hs.332690 


ESTs 


121115 AA399122 


AA398187 


Hs.104682 


ESTs, Weakly similar to mitochondrial ci 


121121 AA399371 


AA399371 


Hs.189095 


similar to SALL1 (sal (Drosophila)-like 


121122 AA399373 


AI126713 


Hs.192233 


ESTs, Highly similar to T00337 hypotheti 


121125 AA399441 


AL042981 


Hs.251278 


KIAA1201 protein 


121151 AA399636 


AA399636 


Hs.143629 


ESTs 


121153 AA399640 


AA399640 


Hs.97694 


ESTs 


121163 AA399680 


AI676062 


Hs.111902


ESTs 


121176 AA400080


AL121523 


Hs.97774 


ESTs 


121192 AA400262


AA400262 


Hs.190093 


ESTs 


121223 AA400725 


AI002110 


Hs.97169 


ESTs, Weakly similar to dJ667H12.2.1 [H.


121227 AA400748 


AA400748 


Hs.97823 


Homo sapiens mRNA; cDNA DKFZp434D024 (fr 


121231 AA400780 


AA814948 


Hs.96343 


ESTs, Weakly similar to ALUC_HUMAN !!!!


121278 AA401631 


AA037121 


Hs.98518 


Homo sapiens cDNA FLJ11490 fis, clone HE 


121279 AA401688 


AA292873 


Hs.177996 


ESTs 


121282 AA401695 


AA401695 


Hs.97334 


ESTs 


121299 AA402227 


AA402227 


Hs.22826 


tropomodulin 3 (ubiquitous)


121301 AA402329 


NM_006202 Hs.89901


phosphodiesterase 4A, cAMP-specific (dun 


121302 AA402398 


AA402587 


Hs.325520 


LAT1-3TM protein 


121304 AA402449 


AA293863 


Hs.97316 


EST 


121305 AA402468 


AA402468 


Hs.291557 


ESTs 


134721 AA403268 


AK000112 


Hs.89306 


hypothetical protein FLJ20105 


121323 AA403314 


AA291411 


Hs.97247 


ESTs 


121324 AA404229 


AA404229 


Hs.97842 


EST 


444422 AA404260 


AI768623


Hs.108264 


ESTs 


131074 AA404271 


U16125 


Hs.181581 


glutamate receptor, ionotropic, kainate 


121344 AA405026 


AA405026 


Hs.193754 


ESTs 


121348 AA405182


AA405182 


Hs.97973 


ESTs 


121350 AA405237 


AA405237 




gb:zt06e10.s1 NCI_CGAP_GCB1 Homo sapiens 


121400 AA406061 


AA406061


Hs.98001 


EST 


121402 AA406063 


AA406063 


Hs.98003 


ESTs 


121403 AA406070 


AA406070 


Hs.98004 


EST 


121408 AA406137 


AA406137 


Hs.98019 


EST 


121431 AA406335 


AA035279 


Hs.176731 


ESTs 


121471 AA411804 


AA411804 


Hs.261575 


ESTs 


121474 AA411833 


AA402335 


Hs.188760 


ESTs, Highly similar to Trad [H.sapiens]


121526 AA412219


AW665325 


Hs.98120 


ESTs 


121530 AA412259 


AA778658 


Hs.98122 


ESTs 


121558 AA412497 


AA412497 




gb:zt95g12.s1 Soares_testis_NHT Homo sap


121559 AA412498 


AI192044 


Hs.104778 




121584 AA416586 


AI024471 


Hs.98232 


ESTs 


121609 AA416867 


AA416867 


Hs.98185 


EST 


121612 AA416874


AA416874 


Hs.98168 


ESTs 


121737 AA421133 


AA421133 


Hs.104671 


erythrocyte transmembrane protein 


121740 AA421138 


AA421138 


Hs.143835 


EST 


436032 AA422079 


AA150797 


Hs.109276 


latexin protein 


121784 AA423837 


T90789 


Hs.94308 


RAB35, member RAS oncogene family 
121802 AA424328

121803 AA424339 
135286 AA424469 
332778 AA424469 
121806 AA424502 
129517 AA425004 
121845 AA425734 
121853 AA425887 
121891 AA426455 
121895 AA427395 
121899 AA427555 

121917 AA428218 

121918 AA428242 

121919 AA428281 

121941 AA428855 

121942 AA428994 
121970 AA429666 
121993 AA430181 
418706 AA430184 



122050 AA431478 

122051 AA431492 
122055 AA431732 
122105 AA432278 

122125 AA434411

135235 AA435512

122162 AA435698 

422072 AA435711 

415106 AA435815 

122186 AA435842 

122235 AA436475 

412970 AA436489 

419288 AA442060 

122310 AA442079 

122334 AA443151 

122382 AA446133 

122425 AA447145 

122431 AA447398 

122450 AA447643 

426284 AA447742 

122477 AA448226 

122500 AA448825 

122522 AA449444 

122536 AA450087 

122538 AA450211 

122540 AA450244 

122560 AA452123 

421919 AA452155 

122562 AA452156 



122608 AA453526 

122635 AA454085 

122636 AA454103 
122653 AA454642 
122660 AA454935 
122703 AA456323 
122724 AA457395 
122749 / 
122772 t 
430242 AA459668 



122777 AA459702
135362 AA460017 
122798 AA460324 
122837 AA461509 

122860 AA464414 

122861 AA464428 
122910 AA470084 
132899 AA476606 
122967 AA478521
422845 AA478523 



128917 AA481252 

123081 AA485351 

123133 AA487264

123184 AA489072 



AI251870 Hs.188898 
AI338371 Hs.157173 
AW023482 Hs.97849 
AW023482 Hs.97849 
AA424313 Hs.98402 
AW972853 Hs.112237
AI732692 Hs.165066
AA425887 Hs.98502 
AA426456 Hs.98469 
AA427396 

R55341 Hs.50421 
AA406397 Hs.139425
BE274689 Hs.184175 
AA428281 Hs.98560 
AA428865 Hs.98563 
AW452701 Hs.293237 
AA429666 Hs.98617 
AW297880 Hs.98661 
U73524 Hs.87465 
AA431293 Hs.98716 
AI453076 

AA431492 Hs.98742 
AA431732 Hs.98747 
AW241685 Hs.98699 
AK000492 Hs.98806 
AW298244 Hs.266195 
AA628233 Hs.79946 
AB018255 Hs.111138
U40763 Hs.77965
AA398811 Hs.104673
AA436475 Hs.112227
AB026436 Hs.177534 
AA256106 Hs.87507 
AW192803 Hs.98974 
BE465894 Hs.98365 
AA446440 Hs.98643 
AB007859 Hs.100955 
AA447398 Hs.99104
AA447643 Hs.112095 
AJ404468 Hs.284259 
AA448226 Hs.324123 
AA448825 Hs.99190 
AA299607 Hs.98969 
AF060877 Hs.99236 
AA450211 Hs.99239 
AA476741 Hs.98279 
AW392342 Hs.283077 
AJ224901 Hs.109526 
AA452156 

AI681654 Hs.170737 
AA453525 Hs.143077 
AA454085 

AW651706 Hs.99519 
AW009166 Hs.99376 
AI816827 Hs.180069 
AA456323 Hs.269369 
AA457395 Hs.99457 
AA458850 Hs.293372
AW117452 Hs.99489
U66669 Hs.236642 
AW904907 Hs.30732 
AK001022 Hs.214397 
AA978128 Hs.99513 
AW366286 Hs.145696 
AA461509 Hs.293565
AA464414 

AA335721 Hs.213628 
AA470084 Hs.98358
AA476606 Hs.59666
AA806187 Hs.289101
AA317841 Hs.7845 
AA535244 Hs.78305 
AI365215 Hs.206097 
AI815486 Hs.243901 
AA487264 Hs.154974 
BE247767 Hs.18166



ESTs 
ESTs 
ESTs 
ESTs 
ESTs 
ESTs 

ESTs, Moderately similar to ALU2_HUMAN A 

hypothetical protein FLJ14303

ESTs 

gb:zw33a02.s1 Soares ovary tumor NbHOT H 

KIAA0203 gene product 

ESTs 

chromosome 2 open reading frame 3 
EST 



ATP/GTP-binding protein 

ESTs, Moderately similar to T42650 hypot 

ELAV (embryonic lethal, abnormal vision, 

EST 

EST 

ESTs 

hypothetical protein 
ESTs 

cytochrome P450, subfamily XIX (aromatiz 
KIAA0712 gene product 
peptidyl-prolyl isomerase G (cyclophilin
ESTs 

membrane-associated nucleic acid binding 
dual specificity phosphatase 10 
ESTs 

ESTs, Weakly similar to S65824 reverse t 
ESTs, Weakly similar to LB4D_HUMAN NADP- 
ESTs 

KIAA0399 protein 
ESTs 

hypothetical protein DKFZp434F1819 
dynein, axonemal, heavy polypeptide 9 
ESTs 
ESTs 



ESTs, Weakly similar to A43932 mucin 2 p 



zinc finger protein 198 

gb:zx29c03.s1 Soares_total_fetus_Nb2HF8_ 

hypothetical protein FLJ23251 

ESTs 

gb:zx33a08.s1 Soares_total_fetus_Nb2HF8_

hypothetical protein FLJ14007

ESTs 

nuclear respiratory factor 1 

ESTs 

ESTs 

ESTs, Weakly similar to B34087 hypotheti 
ESTs 

3-hydroxyisobutyryl-Coenzyme A hydrolase 
hypothetical protein FLJ13409; KIAA1711 
hypothetical protein FLJ10160 similar to
ESTs, Weakly similar to T17454 diaphanou 
splicing factor (CC1.3) 
ESTs, Weakly similar to putative p150 [H 
gb:zx78g01.s1 Soares ovary tumor NbHOT H 
ESTs 
ESTs 
SMAD in 



hypothetical protein MGC2752 
RAB2, member RAS oncogene family 
oncogene TC21 

Homo sapiens cDNA FLJ20738 fis, clone HE 
Homo sapiens mRNA; cDNA DKFZp667N064 (fr 
KIAA0870 protein 
332467 AA489630

123233 AA490225 

123234 AA490227 
123236 AA490255 



123284 AA495812 
123286 AA495824
123315 AA496369 
457397 AA504125 
433049 AA521473 
123421 AA598440 
123449 AA598899
426981 AA599244 



123604 A 
123712 A 
123731 A 
123800 A 
123841 A 



133184 D20085 
132835 D20749 
435147 D51285 
128695 D59972 
124029 F04112 
124057 F13604 
H01662 
H05135 
H 12245 
H22842 



130973 
124106 
124136 
124165 
429627 H43442 
124178 H45996 



427580 I- 
426793 I- 
124274 r 



124315 
437712 

124324
452933 
132231 
421877 
443123 

132963
420473 
417381 
130365 
456610 

439311
124383 
124387 
129341 
419793 

124433
124441 



124483 
124484 
124485 
124494 



H80737 
H93412 
H94892 
H95643 
H96552 
H97146 
H99131 
H99462 
H99837 
N22140 
N22197 
N23756 
N24134 
N24195 
N26739 
N27098 
N27637 
N33090 
N35967 
N39069 
N46441 
N48270 
N48365 



N54157 
N54300 
N54831 



NM_014700 Hs.119004
AW974175 Hs.151875
NM_001938 Hs.16697
AW968504 Hs.123073
AA830335 Hs.105273
AW768399 Hs.106357 
AF084535 Hs.22464 
AI744152 Hs.283374 
AA488988 Hs.293796 
AA495824 Hs.188822



AW969025 Hs.109154 
AU076668 Hs.334884 
AA598440 Hs.291154 
AL049325 Hs.112493
AL044675 Hs.173081
NM_014777 Hs.57730
AA765256 Hs.135191 
AA609135 Hs.293076 
AA609684 



AA620423 Hs.112862
AA620747 Hs.112896
AA621364 Hs.112981
T89832 Hs.170278 
AA001021 Hs.6685 
Z83844 Hs.5790
AL133731 Hs.4774
NM_003478 Hs.101299
F04112 Hs.312553 
AA902384 Hs.73853 
AI609045 Hs.321775 
AI638418 Hs.1440 
H12245 

H22842 Hs.101770 
H30039 Hs.107674
NM_015340 Hs.2450
BE463721 Hs.97101 
AI537162 Hs.263988 
N22687 Hs.8236 
H69899 H69899 
AI769958 Hs.108336
AK001507 Hs.44143
X89887 Hs.172350
H80552 Hs.102249 
AI351010 Hs.102267 
AW952124 Hs.13094 
NM_005402 Hs.288757
X04588 Hs.85844 
H96552 Hs.159472 
AW391423 Hs.288555 
AA662910 Hs.42635 
AW250380 Hs.109059 
AA094538 Hs.272808 
AA099693 Hs.34851 
AL118782 Hs.300208
AF164142 Hs.82042 
W56119 Hs.155103 
AF172066 Hs.106346 
BE270668 Hs.151945 
N27098 Hs.102463 
N27637 Hs.109019 
AI193519 Hs.226396 
AI364933 Hs.168913 
AA280319 Hs.288840 
AW450481 Hs.161333
AA353868 Hs.182982
AI473114 

R10084 Hs.113319 
NM_007203 Hs.42322
AI821780 Hs.179864 
H66118 Hs.285520 
AB040933 Hs.15420
N54831 Hs.271381
N59849 Hs.13565
N79264 Hs.269104 



KIAA0665 gene product 

ESTs, Weakly similar to MAPB_HUMAN MICRO 

down-regulator of transcription 1, TBP-b

CDC2-related protein kinase 7 

ESTs 

ESTs 

epilepsy, progressive myoclonus type 2, 
ESTs, Weakly similar to CA15_HUMAN COLLA 
ESTs 

ESTs, Weakly similar to A46010 X-linked 
gb:zv37d10.s1 Soares ovary tumor NbHOT H
ESTs

SEC10 (S. cerevisiae)-like 1

EST, Weakly similar to I38022 hypothetic

Homo sapiens mRNA; cDNA DKFZp5S4D036 (fr 

KIAA0530 protein 

KIAA0133 gene product

ESTs, Weakly similar to unnamed protein 

ESTs 

Homo sapiens cDNA: FLJ21543 fis, clone C 

gb:ae62f01.s1 Stratagene lung carcinoma 

EST 

ESTs 

ESTs 

ESTs 

thyroid hormone receptor interactor 8 

hypothetical protein dJ37E16.5 

Homo sapiens mRNA; cDNA DKFZp761C1712 (t 

cullin 5 

gb:HSC2JH062 normalized infant brain cDN 
bone morphogenetic protein 2 
hypothetical protein DKFZp434D1428 
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
gb:ym17a12.r1 Soares infant brain 1NIBH 
EST 
ESTs 

leucyl-tRNA synthetase, mitochondrial 
putative G protein-coupled receptor 
ESTs 
ESTs 

gb:yu70c12.s1 Weizmann Olfactory Epithel 
ESTs, Weakly similar to ALUE_HUMAN !!!!
Homo sapiens clone FLB6914 PRO1821 mRNA,
HIR (histone cell cycle regulation defec 
EST 

lysosomal 

presenilins associated rhomboid-like pro
v-ral simian leukemia viral oncogene hom
neurotrophic tyrosine kinase, receptor,
Homo sapiens cDNA: FLJ22224 fis, clone H
Homo sapiens cDNA: FLJ22425 fis, clone H 
hypothetical protein DKFZp434K2435 
mitochondrial ribosomal protein L12 
putative transcription regulation nuclea 
epsilon-tubulin 

Sec23-interacting protein p125
solute carrier family 23 (nucleobase tra
eukaryotic translation initiation factor
retinoic acid repressible protein
mitochondrial ribosomal protein L43 
EST 
ESTs 

hypothetical protein FLJ11126

serine/threonine kinase 24 (Ste20, yeast

PRO1575 protein

ESTs 

golgin-67 

ESTs 

kinesin heavy chain member 2 
A kinase (PRKA) anchor protein 2 
ESTs 

ESTs, Weakly similar to 2109260A B cell 
KIAA1500 protein 

ESTs, Weakly similar to I38022 hypotheti
Sam68-like phosphotyrosine protein, T-ST 
ESTs 






124532 N62375 
133213 N63138 
124539 N63172 
129196 N63787 
124575 N68168

124576 N68201 

124577 N68300 

124578 N68321 
124593 N69575 

128501 N75007
332434 N75542 
128473 N90066 
128639 N91246 
124652 N92751 

133137 N93214
124671 N99148 
133054 R07876 
425266 R10865 
124720 R11056 

124722 R11488
128944 R23930 
132965 R26589 
426504 R37588 
438828 R37613 

124757 R38398
124762 R39179 
124773 R40923 
135266 R41179 
427961 R41294 

414303 R42307
128540 R43189 
124785 R43306 

124792 R44357 

124793 R44519 
124799 R45088

124812 R47948 
124821 R51524 
424123 R54950 
124835 R55241 

124845 R59585
124847 R60044 
440630 R60872 
124861 R66690 
332503 R67256 

124879 R73588
124892 R79403 
124906 R87647 
124922 R93622 
124940 R99599 

124941 R99612
124943 T02888 
124947 T03170 
124954 T10465 
456862 T15418 

410653 T15597
418133 T15652 
440014 T16898 
131082 T26644 
124980 T40841 

124984 T47566
124991 T50116 
457222 T50145 
125000 T58615 
132932 T59940 

444484 T63595

125008 T64891 

125009 T64924 
445384 T64933

125017 T68875
125018 T69027
125020 T69924 
437871 T70353 
134204 T79780 
125050 T79951 
125052 T80174
125054 T80622 



N62375 Hs.102731
AA903424 Hs.6786 
D54120 Hs.146409 
BE296313 Hs.265592 
N68168 
N68201 

N68300 Hs.138485 
N68321 Hs.231500 
N69575 Hs.102788
AL133572 Hs.199009
AI680737 Hs.289068
T78277 Hs.100293
AW582962 Hs.102897 
W19407 Hs.3862 
AB002316 Hs.65746 
AK001357 Hs.102951
AA464836 Hs.291079 
J00077 Hs.155421 
R05283 

T97733 Hs.185685
AL137586 Hs.52763
AI248173 Hs.191460 
AW162919 Hs.170160 
AL134275 Hs.6434 
H11368 Hs.141055 
AA553722 Hs.92096 
R45154 Hs.338439 
R41179 Hs.97393 
AW293165 Hs.143134 
NM_004427 Hs.165263
AW297929 Hs.328317 
W38537 Hs.280740 
R44357 Hs.48712 
R44519 
R45088 

R47948 Hs.188732 
H87832 Hs.7388 
AW966158 Hs.58582 
R55241 Hs.101214 
R59585 Hs.101255
W07701 Hs.304177 
BE561430 Hs.239388 
R67567 Hs.107110 
NM_004455 Hs.150956
R73588 Hs.101533 
AI970003 Hs.23756 
H75964 Hs.107815 
R93622 Hs.12163 
AF068846 Hs.103804 
AI766661 Hs.27774 
AW963279 Hs.123373 
T03170 Hs.100165 
AW964237 Hs.6728 
U55184 Hs.154145
BE383768 Hs.65238
R43504 Hs.6181 
AW960782 Hs.6856 
AI091121 Hs.246218 
T40841 Hs.98681 
BE313210 Hs.334798 
T50116 

NM_004477 Hs.203772
T58615 Hs.235887
AW118826 Hs.6093
AK002126 Hs.11260 
T91251 

T64924 Hs.303046 
T79136 Hs.127243 
T68875 

T69027 Hs.269481 
T69981 

AI084813 Hs.114088
AI873257 Hs.7994 
AW970209 Hs.111805 
T85104 Hs.222779 
T80622 Hs.268601 



EST 
ESTs 

cell division cycle 42 (GTP-binding prot 
ESTs, Weakly similar to I38022 hypotheti 
gb:za11c01.s1 Soares fetal liver spleen 
ESTs, Weakly similar to I38022 hypotheti
gb:za12g07.s1 Soares fetal liver spleen
EST 
ESTs 

protein containing CXXC domain 2 
Homo sapiens cDNA FLJ11918 fis, clone HE
O-linked N-acetylglucosamine (GlcNAc) tr 
CGI-47 protein 

regulator of nonsense transcripts 2; DKF 
KIAA0318 protein 

Homo sapiens cDNA FLJ10495 fis, clone NT 
ESTs, Weakly similar to T27173 hypotheti 
alpha-fetoprotein 

gb:ye91c08.s1 Soares fetal liver spleen 
ESTs 

anaphase-promoting complex subunit 7 

hypothetical protein MGC12936 

RAB2, member RAS oncogene family-like 

hypothetical protein DKFZp761F2014

Homo sapiens clone 23758 mRNA sequence 

ESTs, Moderately similar to A46010 X-lin 

ESTs 

KIAA0328 protein 
ESTs 

early development regulator 2 (homolog o 
EST 

hypothetical protein MGC3040 
hypothetical protein FLJ20736 
gb:yg24h04.s1 Soares infant brain 1NIB H 
gb:yg38g04.s1 Soares infant brain 1NIB H 
ESTs 

kelch (Drosophila)-like 3

Homo sapiens cDNA FLJ12789 fis, clone NT

EST 

ESTs 

Homo sapiens clone FLB8503 PRO2286 mRNA,
Human DNA sequence from clone RP1-304B14 
ESTs 

exostoses (multiple)-like 1 
ESTs 

hypothetical protein similar to swine ac 
ESTs 

eukaryotic translation initiation factor 
heterogeneous nuclear ribonucleoprotein 
ESTs, Highly similar to AF161349 1 HSPC0 
ESTs, Weakly similar to ALU1_HUMAN ALU S 
ESTs 

KIAA1548 protein 

hypothetical protein FLJ11585 

95 kDa retinoblastoma protein binding pr 

ESTs 

ash2 (absent, small, or homeotic, Drosop 
Homo sapiens cDNA: FLJ21781 fis, clone H 
ESTs 

eukaryotic translation elongation factor 
gb:yb77c10.s1 Stratagene ovary (937217) 
FSHD region gene 1 
ESTs 

Homo sapiens cDNA: FLJ22783 fis, clone K
hypothetical protein FLJ11264
gb:yd60a10.s1 Soares fetal liver spleen 
ESTs 

Homo sapiens mRNA for KIAA1724 protein, 
gb:yc30f05.s1 Stratagene liver (937224)
sex comb on midleg homolog 1 
gb:yc19d03.r1 Stratagene lung (937210) H 
ESTs 

hypothetical protein FLJ20551 
ESTs 

ESTs, Moderately similar to similar to N 
ESTs, Weakly similar to envelope [H.sapi 






125063 T85352 


T85352 




gb:yd82d01.s1 Soares fetal liver spleen


125064 T85373 


T85373 




gb:yd82f07.s1 Soares fetal liver spleen 


125066 T86284 


T86284 




gb:yd77b07.s1 Soares fetal liver spleen


416507 T89579 


AL045364 


Hs.79353 


transcription factor Dp-1 


125080 T90360 


T90360 


Hs.268620 


ESTs, Highly similar to ALU6_HUMAN ALU S


125097 T94328 


AW576389 Hs.335774 


EST, Moderately similar to S65657 alpha- 


125104 T95590 


T95590 




gb:ye40a03.s1 Soares fetal liver spleen


135107 T97257 


T97257 


Hs.94560 


ESTs, Moderately similar to I38022 hypot 


423122 T97599 


AA845462 


Hs.124024 


deltex (Drosophila) homolog 1 


125118 T97620 


R10606 


Hs.269890 


gb:yf35f11.s1 Soares fetal liver spleen 


125120 T97775 


T97775 


Hs.100717


EST 


134160 T98152 


T98152 


Hs.79432 


fibrillin 2 (congenital contractural ara 


125136 W31479 


AW962364 Hs.129051 


ESTs 


125144 W37999 


AB037742 


Hs.24336 


KIAA1321 protein 


125150 W38240 


W38240 




Empirically selected from AFFX single pr 


450142 W40150 


AW207469 Hs.24485 


chondroitin sulfate proteoglycan 6 (bama 


131987 W45435 


AW453069 Hs.3657 


activity-dependent neuroprotective prote 


125178 W58202 


W93127 


Hs.31845 


ESTs 


125180 W58344 


W58469 


Hs.103120 


ESTs 


125182 W58650 


AA451755 


Hs.263560 


ESTs 


446888 W68736 


AL030996 


Hs.16411 


hypothetical protein LOC57187 


125197 W69106 


AF086270 


Hs.278554 


heterochromatin-like protein 1 


133497 W69111 


BE617303 


Hs.74266 


hypothetical protein MGC4251 


429922 W69399 


Z97630


Hs.226117 


H1 histone family, member 0 


129232 W69459 


R98881 


Hs.109655 


sex comb on midleg (Drosophila)-like 1 


422166 W72424 


W72424 


Hs.112405


S100 calcium-binding protein A9 (calgran 


125209 W72724 


W72724 


Hs.103174 


ESTs, Weakly similar to TSP2_HUMAN THROM


125212 W72834 


AA746225 


Hs.103173 


ESTs 


456631 W73955 


BE383436 


Hs.108847 


hypothetical protein MGC2749 


125223 W74701 


AI916269 


Hs.109057 


ESTs, Weakly similar to ALU5_HUMAN ALU S 


125225 W76540 


W74169 


Hs.16492 


DKFZP564G2022 protein


125228 W79397 


AA033982 


Hs.110059 


ESTs, Weakly similar to I38022 hypotheti


132393 W85888 


AL135094 


Hs.47334 


hypothetical protein FLJ14495


125238 W86038 


N99713 


Hs.109514 


ESTs 


125247 W86881 


AA694191 


Hs.163914 


ESTs 


129296 W87804 


AI051967 


Hs.110122 


ESTs 


125263 W88942 


AA098878 




gb:zn45g10.r1 Stratagene HeLa cell s3 93


125266 W90022 


W90022 


Hs.186809


ESTs, Highly similar to LCT2_HUMAN LEUKO 


450862 W92272 


U91543 


Hs.25601 


chromodomain helicase DNA binding protei


452401 W92764 


NM_007115 Hs.29352


tumor necrosis factor, alpha-induced pro 


428243 W93040 


H05317 


Hs.283549 


ESTs 


125277 W93227 


W93227 


Hs.103245 


EST 


125278 W93523 


AI218439 


Hs.129998 


enhancer of polycomb 1


125280 W93659 


AI123705 


Hs.106932 


ESTs 


448205 W94003 


W93949 


Hs.33245 


ESTs 


131844 W94401 


AI419294 


Hs.324342 


ESTs 


125284 W94688 


NM_002666Hs.103253 


perilipin 


417111 W94787 


AW016321 Hs.82306 


destrin (actin depolymerizing factor)


445424 Z38294


AB028945 


Hs.12696 


cortactin SH3 domain-binding protein 


125285 Z38311


T34530 


Hs.4210 


Homo sapiens cDNA FLJ13069 fis, clone NT 


446313 Z38465 


H06245 


Hs.106801 


ESTs, Weakly similar to PC4259 ferritin 


431342 Z38525


AW971018 


Hs.21659 


ESTs 


433227 Z38538 


AB040923 


Hs.106808 


kelch (Drosophila)-like 1 


428306 Z38551 


AB037715 


Hs.183639


hypothetical protein FLJ10210


424624 Z38783


AB032947 


Hs.151301 


Ca2+-dependent activator protein for secr


125295 Z39113 


AB022317 


Hs.25887 


sema domain, immunoglobulin domain (Ig), 


125298 Z39255


AW972542 Hs.289008 


Homo sapiens cDNA: FLJ21814 fis, clone H 


125300 Z39591


Z39591


Hs.101376 


EST 




BE622770 


Hs.264915 


Homo sapiens cDNA FLJ12908 fis clone NT 


444582 Z39920 


R55344 


Hs.22142 


cytochrome b5 reductase b5R.2 


130882 Z40166 


AA497044


Hs.20887 


hypothetical protein FLJ10392 


128888 Z40388


AI760853 


Hs.241558 


ariadne (Drosophila) homolog 2 


125310 Z40646


R59161 


Hs.124953 


ESTs 


125315 Z41697


R38110 


Hs.106296 


ESTs 


125317 Z99349


Z99348


Hs.112461


ESTs, Weakly similar to I38022 hypotheti 


135096 Z99394


AA081258 




zinc finger protein 36 (KOX 18) 
TABLE 3A



Table 3A shows the accession numbers for those pkeys lacking unigeneID's for Table 3. The pkeys in Table 7 lacking unigeneID's are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 
Accession: Genbank accession numbers 
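
The three-column layout described above (Pkey, CAT number, Accession) can be read programmatically. The following is a minimal sketch only, not part of the disclosed method. It assumes that each record begins with a numeric Pkey followed by a CAT number of the form "digits_digits" (the second part possibly negative, as in 13684_-12), and that any line not matching this pattern continues the accession list of the preceding record. The function name `parse_table_3a` and the record field names are hypothetical.

```python
import re
from typing import Dict, List

# Assumption: a record starts with "<Pkey> <CAT number> [accessions...]",
# where the CAT number looks like "125446_1" or "13684_-12".
RECORD_START = re.compile(r"^(\d+)\s+(\d+_-?\d+)(?:\s+(.*))?$")


def parse_table_3a(lines: List[str]) -> List[Dict[str, object]]:
    """Group Table 3A text lines into records of Pkey, CAT number, accessions."""
    records: List[Dict[str, object]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines separate rows but carry no data
        match = RECORD_START.match(line)
        if match:
            records.append({
                "pkey": int(match.group(1)),
                "cat": match.group(2),
                "accessions": (match.group(3) or "").split(),
            })
        elif records:
            # Continuation line: more accessions for the current cluster.
            records[-1]["accessions"].extend(line.split())
    return records


if __name__ == "__main__":
    sample = [
        "124106 125446_1 H12245 AA094769 R14576",
        "",
        "108501 13684_-12 AA083256",
    ]
    for record in parse_table_3a(sample):
        print(record["pkey"], record["cat"], len(record["accessions"]))
```

A lookup from Pkey to the accession list can then be built from the returned records, which is convenient when cross-referencing probesets that lack a unigeneID.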



108469 116761_1 AA079487 AA128547 AA128291 AA079587 AA079600
124106 125446_1 H12245 AA094769 R14576
108501 13684_-12 AA083256 

108562 36375_1 AA100795 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274

101300 4669J BE535511 M62098 M306787 AW891766 AA348998 AA338869 AA344013 AW956561 AW389343 AW403607 L40391 

AW408435 AA121738 AI568978 H13317 R20373 AW948724 AW948744 AA335023 AA436722 AA448690 C21404 
AW884390 AA345454 AA303292 AA174174 BE092290 T90614 AA035104 R76028 AA126924 AA741086 AW022056 
AW1 18940 AA121666 AI832409 AA683475 AI140901 AI623576 AW519064 AW474125 AI953923 AI735349 AW150109 
AI436154 AW118130 AW270782 AI804073 N27434 AA876543 AA937815 AI051166 AA505378 AI041975 AI335355 
AI089540 AA662243 AI127912 AI925604 AI250880 AI366874 AI564386 AI815196 AI683526 AI435885 AI160934 H79030 
AI801493 AA448691 AI673767 AI076042 AI804327 AA813438 AA680002 AI274492 T16177 AI287337 AI935050 
AA907805 AA91 1493 AI58941 1 AI371 358 AW576236 AI078B66 AW51 61 68 AA346372 AI5601 85 AA471009 R75857 
AA296025 AA523155 AA853168 AI696593 AI658482 AI566601 AW072797 AA128047 AAD35502 AW243274 AA992517 
R43760 

132091 94851_1 AW954243 AA829930 AA412478 AA828434 AA814538 AI927418 AI192435 W52897 AA443666 AA031913 AI683306

AA918481 AI183314 D83907 Ai206B32 AA876122 D83836 D83838 D82533 AI761290 A1191125 AI143749 AW771909 
AI241436 AI767267 W56507 AA847787 AA568692 T10502 AI247870 AA715017 AA643304 AA890233 AA81 1387 
AA897470 AA907729 AI708679 AI07801 0 AA452830 AW419160 AI783713 N80205 W56778 AA676899 AI88871B N69930 
AI338935 AI217580 AA639508 AA575836 BE04S852 AI312651 AI038405 AA628649 AA643838 AI493761 AA032024 
W38849 AA340178 AA447052 AA452969 W19359 AA2S6364 H44229 W58767 C05751 C05835 AI741989 N98532 
AW102617 AA412583 AI922246 W38495 AA355375 AA928571 C05275 AA352500 N93132 

117034 20113_2 U72209 NM_005748 AI655607 AI052758 AA385199 AW955794 H88679 AL135153 AI7S5644 AA3B4399 AW966458 

AA568443 AA804610 AI873513 H88639 Z25371 R63456 W44919 

100752 33207_21 T81309 BE019033 R94181 BE019198 NM_000612 J03242 AW411299 BE300064 BE297544 R94182 AW630108 T53723

D58853 H78073 H80594 BE299560 T48899 H70196 M17426 N77077 S77035 H58384 H61664 H78540 T84527 C17198 
H60255 H71980 R92644 W79050 X0091O M29645 R91055 M17863 M17862 T71815 BE299561 BE464561 X06260 
R94741 T54216 C18594 BE262015 X06151 AW409889 AA378400 BE263228 BE313278 R88116 BE313457 H43500 
T48617 BE313761 H77309 AI207601 X06159 H40413 X03425 T87663 R10627 X03562 M141 18 W03982 R97520 H81229 
T83157 H83168 H48762 AA669898 BE263054 H47289 M022807 R1 1555 H74260 R76968 R28338 H72534 H72464 
H62031 N72478 N45355 AW411300 R89113 R69135 H58454 T83281 R93476 H69645 H68015 T82229 H71089 T85121 
H59939 W65299 N78176 H53909 N72373 R21788 H04660 H59639 H61874 BE262219 T53614 N73335 N50464 W00943 
N77189 R89257 AA570502 R89432 R06366 AA553480 AA776271 AA551359 AA551050 H51670 M601052 BE299081 
HS8198 H52276 BE207832 N91 192 H70332 X07868 X07868 H69464 H53782 H73710 R80435 M553384 AW884176 
N53475 T71662 AW954036 AW954033 AA552931 H93205 AA430218 M553476 AI918470 T54124 BE207982 BE300177 
N73994 AW882625 N39549 N53838 AA722389 H71878 H58909 H37849 H78435 T47933 R77174 R83814 AA411890 
H94199 M663208 BE205778 AA490137 H70492 R98232 H37800 AA679294 H40341 H74238 H47290 H73231 T48618 
AA025428 AI039521 H92969 N593S9 H8053E H72933 T90530 M411891 N55000 H74225 M340290 AW957061 T54316 
AA340437 H57125 H58908 H7S027 H63450 N74623 R93425 H68714 H68758 N68396 H48763 N69256 H57320 H53831 
H53589 N68833 N52453 H56048 H69870 H78074 R69253 R83375 T53615 H94330 H58455 H908S4 T47934 H74261 
R89258 R97997 R91056 R28339 R86760 H78235 R97521 H67692 H40358 AA022688 H52513 H59801 T88690 H65255 
H63397 W65397 AA553588 R19280 N52645 W73930 R06367 R21743 H72372 N73921 AW883539 AW882639 T40616 
H47084 R95723 AA634316 AA862781 H77310 R91389 H93111 R92767 T54512 R89341 H70333 H57817 H82941 H62032
N52638 H58385 T91798 H5106S M340292 T49918 H81230 R36121 N5041 1 T87664 N62436 N39340 AA665637 
AA340446 H93377 H92973 BE296290 BE269788 H61665 AA340444 N54605 AA454101 R10628 R94200 AI200549 
AA342640 BE298855 BE250229 T49916 H82008 N28278 AW880662 H71268 N76791 H47685 H65255 W05198 
AW889144 N76677 H71702 H58036 H71915 R91612 R87807 H68059 AI133328 AI247866 AA621443 AW881050 
AA700847 AA340413 AW878608 AW881181 AW878249 H71916 N54596 BE161581 AW878082 W04212 AW881040 
AW885492 AW880519 AA334387 AW878715 W06882 AW630222 AW885381 H70869 AW381778 H47601 AW889982 
H63868 AW884986 AW878713 AW878685 R36391 AW878694 AA368070 C03393 AW878695 AW878705 AW878665 
AW878742 AW878620 AW878823 AW878688 R29048 AW878690 AW878686 AW87B810 AW878827 AW878733 
AW878659 AW878749 AW878681 AW383353 AW883277 AW883300 AW883565 AW883298 AW883143 AW883045 
AW883482 AW883352 AW88341 7 AW883357 AW883231 AW883474 AW883355 AW882620 AW882533 AW883754 
AW883139 AW882827 AW883641 AW833567 AW883481 AW882983 AW882982 AW882465 AW883419 AW882466 
AW883639 AW883230 AW882981 AW382534 AW882874 AW882619 AW883480 AW882826 AW882831 AW882835 
AW882830 AW883563 AW882456 AW527642 

116417 5418J 1 AW499664 AW500888 AL042095 AW576556 AW265424 A1521500 AA761333 AA761319 AW291137 AA649040 AA769094 

AA489664 AA63531 1 AW070509 M425658 AI381489 AA609309 AA134476 W74704 AI923640 AW084888 H45700 
AI985564 AW629495 AW614573 AI859571 AI693486 AA913892 AI806164 AA909524 AW263513 AI356361 Z40708 
125020 116017_1

125104 413347_1

124575 1666649_1

125263 1547_2

131859 3672_1



AI332765 AI392620 AA181060 AW1 18719 AW968804 AW263502 AW505314 AA036967 W74741 R51 139 H19364 H45751 

Z44962 AW370823 H25650 T54007 AA453000 AL045739 

AA609684 AA758732

W73853 AA9281 12 W77887 AW889237 AA148524 AI749182 AI754442 AI338392 AI253102 AI079403 AI370541 AI697341 
H97538 AW188021 AI927669 W72716 AI051402 A1188071 AI335900 N21488 AW770478 W92522 AI691028 AI913512 
A1144448 W73819 AA604358 N28900 W95221 AI868132 H98465 AA148793 
T91251 T64891 T85665 



AA098878 W88942 

AW960564 AA092457 T55890 D56120 T92525 A1815987 BE182608 BE182595 AW080238 M90657 AA347236 AW961685 
AW176446 AA304671 AW583735 T61714 AA316968 AI446615 AA343532 AA083489 AA488005 W52095 W39480 N57402 
D82638 W25540 W52847 D82729 D58990 BE519182 AA315188 AA308636 AA1 12474 W76162 AA088544 H52265 
AA301631 H80982AA1 13786 BE620997 AW651B91 AA343799 BE613669 BE547180 BE546656 F11933 AA376800 
AW239185 AA376086 BE544387 BE619041 AA452515 AA001806 AA190873 M180483 AA159546 F00242 AI940609 
AI940602 AH 89753 T97663 T661 1 0 AW062896 AWC62910 AW062902 AI051622 AI828930 AA102452 A1685095 AI819390 
AA557597 AA383220 AI804422 AI633575 AW338147 AW603423 AW506800 AW750567 AW510672 AI250777 AA083510 
AW629109 AW513200 AA921353 AI677934 AI148693 AI955858 AA173825 AA453027 AI027865 AW375542 AA454099 
AA733014 AI591384 R79300 R80023 M843108 M626053 AA84489B AW375550 AA889018 AI474275 AW205937 
AI052270 AW3881 1 7 AW3881 1 1 AA699452 AI242230 N47476 H38178 AA3S6621 AA113196 AA130023 H39740T61629 
AI885973 AW083671 AA179730 AA305757 AI285455 N83956 AA216013 AA336155 AW999959 T97525 AA345349 
T91762 AA771981 AI285092 AI591386 BE392485 BE385852 AA682601 A1682884 AA345840 T85477 M292949 
AA932079 AA098791 D82607 T48574 AW752038 C06300 
R20840 R20839 

M30269 NMJ02508 X82245 AI078760 AW957003 D78945 M27445 AA650439 AL048816 AV660256 AV660347 
AA333052 BE295257 T60999 AA383049 AW369S77 Z26985 AW175704 AA343326 AW747957 AI818389 W17308 
W17302 H15591 AA371284 M370412 W94966 BE384365 T28498 R80714 R16959 H21723 AW835154 D56097 D56381 
W21232 AA1 90565 AW379755 AW067895 

AW136928 AI685655 BE218584 BE465078 N68963 AA975338 BE147199 N76377 

BE273749 BE397561 BE387189 AL037858 AL037878 AI963094 BE259216 AA01 1363 AL036189 BE562325 AA251 169 
BE617431 N98537AA1 58093 AL047800 M34539 NMJ00801 M312140 D16971 AA15B904 AA307114 AA312803 
T09203 AW629686 AL048504 BE388578 AA22D957 AA158364 BE267385 AA294971 C18055 BE241757 AA115056 
AI936769 BE378435 BE206971 AW674924 BE622060 AA604674 AA115273 AW402159 AA338608 BE568819 M80199 
X55741 AA375111 AA376016 BE512671 AA805742 AW405588 N25850 N44580 H06031 AW403549 BE536552 AA056726 
BE543239 M082517 AI201 545 AI201 642 A1192622 N40104 AA370921 BE547569 A19B9S02 AA302038 AI197890 
AW268354 AI014938 W45448 AI541395 AA037272 BE538826 AL039613 BE536130 AA299355 AW805147 AW974624 
H53220 AI471471 AA399303 AA0D738B W35106 BEB13277 R12739 R12738 AA304342 AA687802 BE409581 AI498844 
AV662092 AW904105 AA011375 BE315214 H99302 BE537893 N32299 AW855829 AI291320 BE078322 AI301395
AA303362 N32719 AA358328 AA357877 AI95254D H56279 H02758 H02048 AW805233 R82224 M410772 AA291352 
BE171103 N69935 BE16924B .AA361 173 H44978 BE617887 D52560 M084043 W03595 R67219 N36477 N42924 R67104 
H44901 H79695 W21105 AA393988 W30899 AA316096 BE622896 W46872 AA442678 BE544893 BE540112 BE621873 
AA338067 N55052 BE398154 BE621210 AA740760 C03739 C03206 BE396692 AA482370 AA031614 AA301575 
AA304710 AA132153 M029796 AA994960 H19567 AA442969 H49781 H46871 AA035395 AA055185 AA149378 
AA643080 AL135479 AA292329 AA654337 AA04122B AA454888 AA025039 W58331 AA625981 T94941 AA302448 
H 19900 M218956 AA513790 AA563962 AA398076 W44441 AA293276 W47373 AA625879 W30688 AA043029 T642B4 
R79151 AA304340 AA485186 AA604939 R82470 AA421425 AW771456 AI339329 AA304424 AA605236 AA936934 
AA587673 AI209162 AI697301 AI479995 AI679814 AI361950 AW189125 AI955888 AI986019 BE301019 A1084792 
AI310211 AW189307AI022070AW977204A1146825 AW190163 AW303281 AI828345 BE048043 AW029257 M482268 
AI246507 AI420729 AW084932 AW439514 AI890487 AW439692 AI523896 AI186612 AI659953 AI889773 AA687527 
AW072694 AW262153 AW467371 AI613269 AI679238 D54404 M158103 AW105527 AW149739 AW150361 AW268387 
AW1 17708 AI951682 AI687440 AW674285 AA678365 AI587082 AA732095 AA019899 W45661 AA627300 BE613304 
AA765891 M612935 AI814658 AW316916 R66594 AA514640 AA025040 AA031472 AW732076 AA029797 AI244560 
AI128734 AW381720 AI092360 AI263283 AW613175 AI890675 AI720156 AW631348 AI635106 AI278045 AA303979 
AA703505 W45449 AW078661 AI292052 AW381707 AI147854 AW381743 AA158905 AA303258 AA888144 AW195967 
AA428706 AA989559 AA517731 H19882 BE543418 AA830386 AA421302 W58652 T94995 AI869743 AI679145 
AW085971 N98425 AA765136 A1347027 AI356955 AA928038 AI679717 AA458459 AA679281 AI357973 AI270041 
AA765135 AA732793 AI798447 AA668646 AA251008 AI9B4538 AI401737 AA056186 BE043308 AW662375 AI3021 10 
N50724 W96332 BE537047 N26983 AI557172 AA765296 AW673237 N29784 AA534275 M084044 AW067973 
AW300766 T63393 W46823 R39790 AI364185 AW298582 AA454814 AW069878 N67751 H05982 N23140 AI362647 
AI302086 AI767772 N25755 H53114 AA706133 T93511 M429291 AA935294 M987647 W02803 R66595 AI680795 
W23673 AW440794 AA722872 H49538 AW131042 AA531603 AA908665 AA040791 AA235312 W52205 N93444 R82180 
H02759 H79696 AW088894 H55079 AA961 1 43 AW067776 AW973745 AA01631 1 AW071227 AA01751 1 AI753994 
W47374 T64155 AA296092 AI698626 AA558158 AA296088 AW794259 H01963 AA149267 AA485076 AA975856 H44938 
AA035396 AI955555 H46289 AA4861 61 AI631222 AA359047 AW794253 AI806962 AW243930 AA526145 AW878734 
AA018464 AA132031 R67220 R79152 AA296093 H54300 AI005160 BE242548 AW992803 AW878644 AW878666 T27742 
R82471 AW517604AW472738 AI282904 R39791 AA486098 AW467891 AW960520 AA551736 AA056621 AW945197 
R66373 AA554236 BE2422D2 AI904376 A1832590 H19484 R00890 AI627677 AA302287 AI869451 AI734855 AI708073 
AI832902 AA585184 AW204299 M0555B5 D12417 D1 1975 T63543 AW664099 R54423 BE612712 T96340 T63985 
AA598917 T40735 T64053 AA1492B4 AW272548 AA363445 AA042893 AW300697 BE261973 T53501 T53500 AW878729 
AW878657 AW794391 AA069193 R01553 H44875 AA385406 M533968 M93060 AL135600 W96331 AA017651 
AA018849 AA017692 H85337 BE278690 AA731 598 AA018512 AI076813 AI022644 R02585 X52220 AW296894 M825671 
AI699321 AI393601 AW592611 AI146747 AA608921 AA158365 AW590007 AA354519 D20081 R02704 AW798339 
M92422 AA094903 AA007676
135096
103767 



103855 84277_1



126872
113026 




120284 
112540 
111904 
121094 

128510
114106 
121335 
120761 
122050 

130018
100104 
121822 



1605263_1
1719336_1
275729_1



279548_1
224903_1
273507_2 



107832
123523 
123533 
132225 
125017 

125063
125064 



110682_1
111495_1

genbank_AA021473 
genbank_AA608588 
genbank_AA608751 
genbank_AA128980
genbank_T68875 T68875 
genbank_T85352 T85352 
genbank_T85373 T85373 



AI352558 Z82248 X78138 NM_003405 AU077248 AA223125 S80794 D78577 A1124697 AW403970 BE614089 BE296713 
BE621334 L20422 X80536 D54224 D54950 X57345 N29226 AA127798 AA340253 F08031 M192540 H67636 AA321827 
AW950283 AA084159 BE538808 AW401377 AA256774 C03366 W46595 W47608 AA305009 H69431 H69456 AL120082 
H11706 AA303717 AA361357 H22042 H78020 AW999584 AA134368 AA322911 AA322961 H60980 N85248 N31547 
H79624 T11718 W85826 AW894663 AW894624 BE167441 BE170015 AA304626 AW602163 AW998929 AA156681 
AA151067 BE002724 AA608688 H82692 BE155392 AW383636 BE155394 AA487004 AW383504 AI342365 R82553 
W16498 BE155344 AI143938 R69901 AA322873 AW340648 R25364 AA367935 AI559406 AA033522 AA374252 
AW835019 AI922133 AI697089 N99662 AW189078 Al 199076 AW1 51 598 W59944 AA662875 W94022 AA299055 
AI039008 AI829449 AA583503 A1635674 AW131665 AI473820 AW2731 18 AW900930 AA908944 AI688035 AW170272 
AI082545 AW468176 AI608761 AI082743 A191 1682 A/243943 AI831016 AA192465 AI218477 AA938406 AA385288 
AI809817 AA905196 AI191245 AI470204 AI188295 AI421367 AI125315 AI087141 AA629032 AA740589 AI554181 
AA150830 AI248541 AI077943 AA775958 AA864930 AI261476 AI123121 AI310394 AA862331 AA872478 BE537084 
AI205606 AA720684 AI872093 AW150042 AL1 20538 AA219627 AA988608 C21397 AI359337 H25337 AI089749 
AA605146 AI359620 AA150478 AI359738 AW383642 AW995424 AI766457 R56892 AI089839 W61343 N69107 W46459 
AA565955 N20527 AI279782 W46596 AA776573 H23204 AI866231 AI083995 N21530 AA126874 D82630 W65437 
AI086917 AW382095 AI086877 H39844 AW340217 W85827 L08439 AA262704 AA505380 W47413 W94135 AA223241 
AW089153 AA084101 BE538000 AA0961 25 T28031 AA491 574 R84813 AA774536 AW383522 AA155615 AW383529 
AA491520 AW026427 AA171496 AI469689 AW664539 AI81 1 102 AI81 11 16 BE464590 BE350791 H78021 T15405 H21979 
AA219489 H13301 AA505883 AI864305 AI423963 AW084401 F04963 R69858 H67097 A1917740 AI655561 H69864 
AA033631 AW383484 AI886261 H25293 AA51 3281 AW271 187 H1 1617 N79982 AI174338 AI904207 AI904208 BE614558 
W94127 W65436 AI272249 AA700018 AI579932 AI085941 AW152629 

AA334551 BE008229 M307537 AW961156 AW995894 AW995826 NMJ06751 M61 199 AA045603 AL036372 AV645606 
AI688095 AW351901 AA101337 AA101345 N73342 BE01B030 BE569044 AW841975 AA373388 BE090412 H95440 
N53845 R67867 AA093441 AA363427 H93708 AW023134 AW994986 AW994989 BE090429 R23614 AI567932 H03726 
H01 101 H01867 AA548743 AI671806 AW872949 AW872941 AA742447 AI199788 AA045604 AI637465 AI741796 
AW242217 AW131463 AI765302 AI683923 AA889762 AI804889 A1986437 C06049 BE502340 AI695651 AI491970 
AA496804 AA281008 M665699 AI473814 BE301445 AA707837 AA551925 AI017348 AI208185 AA775203 M156296 
AA557463 H95441 AA768547 AW769358 M991197 AA181954 AI091389 AI147289 AW771837 AI638582 AA844411 
AI374750 T29320 AW951272 AW085923 H02834 AA843259 AA814696 AW183290 AA158453 N68125 N69039 AA100423 
AA101346 AI918720 H01102 R67868 H01868 N66438 R46580 AI858433 AA599560 AA187577 AA157481 AA361520 
AL047827 AA158452 R21688 AW964874 AA325161 R40871 AW752395 AW375924 R13355 AA281174 AA428908
AA081258 AA160311 W17034 H8359S Z99393 AI831206 AW771108 AW769214 N89775 AW161495 AW161522 
AW160880 Z99394 AI814820

BE244667 BE241813 BE242271 AA381943 NM_016040 AF151858 AW967497 AW966873 AI824386 AW470133
AW015765 BE018650 AW503659 AI129838 A1632346 AA013099 AW770511 BE219482 AI824135 AI867379 AA019318 
AA285143 AW087624 AI990100 AA251084 AI633952 AA287714 AA400773 AI292112 AW469095 M743312 AW117423 
AA694551 AA885657 M1 12675 BE327333 AA082161 H03613 AA094735 AW500235 N28878 AA287713 AW300233 
AA826249 N46921 BE348728 AW505056 AW966879 AI521202 AA393405 AI264668 AA910851 AA251721 AI470834 
H03503 AA089688 R58562 BE004728 AA668793 H27167 R54717 

W02363 N80298 AA304486 AW954799 AW805135 AW970817 AW373398 AW875459 AA136805 AA683501 N73299 
AW341082 AI632954 AA493369 AI478433 AI037911 AW272169 AW043832 AA010683 AW629090 AW183622 N64510
AW079953 AI554533 AA563670 AA010682 AW237610 AW419057 AI470926 AI627833 AA195080 AA195179 AI471443 
AW590266 AI168477 AW771214 AI767341 AW340086 AW748455 AI280D79 AI244821 AI381283 AW300130 AW183374 
AW1 95397 AA136706 AI824598 AW573004 Z98448 AA905255 AI497883 

AW450979 AA136653 AA135656 AW419381 AA984358 AA492073 BE168945 M8D9D54 AW238038 BE01 1212 BE011359 
BE011367 BE011368 BE011362 BE011215 BE011365 BE011353 

AA376654 W76367 AA318232 AI694545 AI742403 AI887383 AW204731 AW874431 BE220997 AA114979 AA303838 
AI002267 AW952031 W74801 AA011287 AA1151 12 AI306385 R37677 AW571707 R59986 W94102 AW197042 H10206 
AW139819 AI686172 AI674165 R51633 AI367086 T23948 H10833 H23002 H11743 R37085 Z39208 H22794 H11820 
R13817 Z43122 H10257 R88398 R18795 AA010848 R67191 H10875 R67170 
AA179656 AA182626 AA182603 
R69751 R70467 H69771 H80879 H80878 
Z41572 R39330 
AA402505 M398900 

X94703 NM_004249 R52316 T87420 N46403 Z36855 BE076834
AW602528 BE073359 Z38412 
AA404418 AI217248 
AA321890 R18000 

AI453076 AI376075 AI014836 AA628633 AA961066 AI150282 AI028574 AI217182 AA732910 AA431478 AL041229 

AA353093 AW957317 AW872498 AI560785 AI2891 10 AW135512 X97261 T68873 

AF008937 

AI743860 N49543 AW027759 BE349467 AI656284 BE463975 R35022 AA370031 AW955302 AL042109 N53092 AI61 1424 
AL079362 AI969290 AI928016 BE394912 BE504220 BE467505 AI61161 1 AI611407 AI611452 W56437 AI284566 
AI583349 AW1 83058 AI308085 AI074952 AA437315 AA628161 AW301728 AI150224 AA400137 M437279 AI223355 
AA639462 AI261373 AI432414 AI984994 AI539335 AA401550 AA358757 A1609976 AA442357 AA359393 AA437046 
AA370301 AA429328 AW272055 A1580502 AI832944 AI038530 AA425107 AI014988 AI148349 AW237721 AW779756 
AW137877 AM 25293 AA400404 R28554 
AA065069 AA085108

AA069818 AA069971 AA069923 AA069908 
AA021473 
AA608588 
AA608751 
111514
104534 
120340 



124254 
101447 
101458 
101667 



124576 
108931 
108941 
124720 
124793 
124799 
103138 
117683 
124991 
103432 
119174 



AA206828 



genbank_T91518 T91518 
entrez_J00212 J00212 
entrez_U30245 U30245 
NOT FOUND_entrez W38240 
genbank_C13961 C13961 
genbank_N55493 N55493 
genbank_N57493 N57493 
entrez_U51010 U51010
genbank_N63520 N63520
genbank_N66845 N66845
genbank_N68905 N68905
genbank_R07998 R07998 
R22303 at R22303 
genbank_AA206828 
genbank_AA227469 
genbank_AA027317 
genbank_AA235050 
genbank_AA302809 
genbank_AA346495 
genbank_AA348913 
genbank_T97307 T97307

304084J AI583948 AA578212 AW303715 AA653450 AA456981 A1400385 W88533 AI224133 AW272145 AA088686 R94698 
genbank_W84768 W84768
genbank_AA452156 AA452156 
genbank_AA454085 AA454085 
genbank_AA064859 
genbank_AA075374 
genbank_AA464414 
genbank_AA076382 
genbank_AA078986 

231290J AW411259 H23555 AW015049 AI684275 AW015886 AW068953 AW014085 AI027260 R52686 AA918278 AI129462 

AA969360 N34369 AI948416 AA534205 M702483 M705292 
genbank_AA084415 AA084415
genbank_H69899 H69899
entrez_M21305 M21305
entrez_M22092 M22092

13349_1 NM_005381 M60858 AW373732 AW373724 AW373689 AW373629 AW373609 AW373776 AA187806 AW386946

AW374207 T05235 AA21 6203 AW385556 AA306940 AA305526 AA31 5461 AL036757 AW37371 1 AW4031 24 AW403640 
AW377084 T27360 H62638 F06957 AW377051 AA554779 AA378568 AA096007 AW352407 AW302637 F07929 H17433 
AW382712 H05665 F07292 N39875 AA089729 H62556 N42842 R12952 AW373735 AW364155 AA056183 W39185 
AW382708 N32488 AF114C96 AW375993 AI1335S9 W52561 AA603040 AA133710 AI928796 AW176370 AA827519 
AW338437 AA521 1 42 T29341 AI800461 AW317002 AA703914 AA860830 AI859203 AI445772 AA714334 AI817066 
A1832027 AW510442 AI635B02 AW08B3C6 AW06B672 AV/408555 AW467542 AA552657 AA152367 W32081 AA582124 
AA074040 AA931657 AI051154 AW410203 AI921544 H17434 AIB32330 AW404836 AI925038 AA088423 AA954166 
AA580453 AW021292 AI267215 AW080082 AW383778 AI933053 AI919097 W31557 N90245 AA931591 AA563995 
F36352 M056184 AA476294 AA641327 AA533550 AI749630 W58323 AA569119 AA508573 AI809050 AI37B996 
AA41 1 362 AW407505 AA938104 AA074041 AA632876 AW193748 AA507873 AI270128 AI472365 AA411353 AI523216 
AI719965 AI816302 AA182681 AI707990 AA133588 AI758537 W60253 AI460308 AA135423 AI083904 F04188 N89693 
AW408776 AI678595 AI270568 AA722059 W58234 F33650 AA090547 AA285108 AA425981 N85079 D20218 AI273980 
AA159028 F03226 AW247914 N26918 AW272741 N90109 H05666 N23327 AW247953 R44748 AA962015 F0355B 
AI752394 AW409913 AW248396 AI816463 AI752393 AA325370 AA263089 AI570130 AI971951 AI160658 AI357360 
AW168686 AL121075 AW050536 N21672 W67748 AA514242 AI127386 H14607 AI185752 W79364 AA088520 AA152476
AW351940 AW373683 AI940524 AW374953 T56500 N24329 AI940720 AW374933 AW374947 AW391913 AL138337 
AW376241 AW062943 F26666 AW410202 AW062958 F34529 AW381807 AW393315 W17147 AW176359 AA664576 
AW380424 AA306040 AI745674 AW300951 AI1 88579 AI438973 AI305271 AA433818 AA612807 AI831809 AI940409 
AA1 58663 AI572988 

genbank_N68201 N68201 

genbank_AA147186 AA147186

genbank_AA148650 AA148650 

144582_1 R05283 R11056 

genbank_R44519 R44519

genbank_R45088 R45088 



genbank_N40180 N40180 
genbank_T50116 T50116
entrez_X97748 X97748 
genbank_R71234 R71234 
95573_2 T11483 T11472

11235_1 AW247252 AA346143 NM_000270 AA381085 N91995 X00737 AA381079 AA296473 AA296110 AA315735 AA311617

AA326750 AA376804 AW403290 T95231 M1 3953 T47963 HB2039 AA279899 AA627997 N76320 N99527 H37842 
W20095 AA457308 AW469547 AA724143 H83220 AA319496 W86334 W30892 R89169 R99427 N41854 H47286 
AA348094 AA045089 R63016 AI922219 AID24906 AI096488 AI885005 AA194872 N90489 AI452544 H7241 1 AA282427 
AA430735 R68963 R22453 H70385 AW129369 AW467320 AW51 9082 AA345018 AA582183 AI961789 R65918 N3061 1 
AI979189 AI280889 AW273191 R66531 AI285845 AI675927 AI421990 AW190879 H37794 AA699667 H68427 AA954388 
AI188757 AI140048 AA430382 AI204151 AW247864 AA559099 AI431420 AA548276 AI149466 AA772669 AA694388 
AA7241 68 AA301 651 AA281 952 AA779925 AA234760 W86290 AA91 3603 AW51 1745 AI500697 M814922 AA835040 
T47964 H53998 M975804 R98710 AI077604 N70252 R98084 AW250171 H69268 AI597614 AA970746 AA972548
AI377116 R62962 H16737 R89070 AA731329 R66532 N54354 AI81B832 H81944 N71567 T95122 W86463 AA437095 
AI431999 AI915724 N63851 AI674743 AA457307 AA21 1475 N64444 AI799146 H72853 R99335 H60413 AA770367 
AA156105 AI269937 H64029 H89723 R65819 AW470496 AI873318 AI735713 H82987 C02447 AI478666 T27651 
AI699770 AW025156 H69719 AI934717 N69225 AI459856 AA953577 AI424691 H13843 R22404 AI873796 AI336002 
N70898 AI420854 AA541792 AA346142 AI000814 AI828348 AA045090 T51257 N90434 H13890 N73184 AI708083 
M781606 AA329050 M339985 R68964 H64795 W04186 H16845 



119416 
119558 
119559 
119654 
121350 
121558 
105985 
114648 
121895 
100327 
123315 
123473 



genbank_T97186 T97186 
NOT_FOUND_entrez_W38194 
NOT_FOUND_entrez_W38197
genbank_W57759 W57759



entrez_D55640 D55640
714071_1 AA496369 AA496646
genbank_AA599143 AA5 



genbank_AA405237 
genbank_AA412497 
genbank_AA406610 
genbank_AA101056 
genbank_AA427396 



W38194 
W38197 



AA405237
AA412497 
AA406610 
AA101056 
AA427396 



AA599143 
Unique Eos probeset identifier number
Accession number used for previous patent filings 
Exemplar Accession number, Genbank accession number 
Unigene number 



Unigene Title: Unigene gene title 



ExAccn UniGene UnigeneTitle 



100405 
100420 
100481 
100484 
100718 



101168 
101194 
101261 
101345 
101447 
101485 
101543 
101550 
101560 
101674 
101714 
101741 
101838 
101857 
102012 



D86425 
D86983 

HG1098-HT1098 
HG1103-HT1103 
HG3342-HT3519 
J03764 
L06797 
L15388 
L20971 
L35545 
L76380 
M21305 
M24736 
M31166 
M31551 



D86983 Hs.118893
X70377 Hs.121489
NM_005402 Hs.288757
BE295928 Hs.75424



M61916 
M68874 
M74719 



U03057 
U03877 
U18300
U27109 
U31384 



102564 U59423

102663 U70322

102759 U81607

102778 U83463

102804 U89942

102887 X04729

102898 X06256

102915 X07820

103036 X54925

103037 X54936



103158 
103166 
103185 
103280 
103554 



104465 
104592 
104764 
104786 
104850 



104974 
105178 
105263 
105330 
105376 
105729 
105826 
105977 



X79981 

Z18951 

AA187101 

N24990 

R81003 

AA025351 

AA027168 

AA040465 

AA045136 

AA054087 

AA071089 

AA085918 

AA187490

AA227926 

AA234743 



AA398243 
AA406363 
AA411465 
AA412284 



D30857 Hs.82353 
NM_005795 Hs.152175
M21305 

AA296520 Hs.89546 
M31166 Hs.2050 
Y00630 Hs.75716 
AW958272 Hs.347326 
NM_002291 Hs.82124
M68874 Hs.211587
NM_003199 Hs.326198
BE243845 Hs.75511
BE550723 Hs.153179
BE259035 Hs.118400
AA301867 Hs.76224
NM_000107 Hs.77602
NM_007351 Hs.268107
AW161552 Hs.83381 
U33053 Hs.2499 
U59423 Hs.79067 
NM_002270 Hs.168075
NM_005100 Hs.788
AF000652 Hs.8180
NM_002318 Hs.83354
J03836 Hs.82085
NM_002205 Hs.149609
X07820 Hs.2258 
M13509 Hs.83169 



NM_005424 Hs.78824
BE242587 Hs.118651
AA159248 Hs.180909
NM_006825 Hs.74368
U84722 Hs.76206 
AI878826 Hs.74034 
AA187101 Hs.213194 
Z44203 Hs.26418 
AW630488 Hs.25338 
AI039243 Hs.278585 
AA027167 Hs.10031 
AL133035 Hs.8728 
T79340 Hs.22575 
AF065214 Hs.18858 
AW076098 Hs.345588 
Y12059 Hs.278675 
AA313825 Hs.21941 
AW388633 Hs.6682 
AW338625 Hs.22120 



v-ral simian leukemia viral oncogene hom
inhibitor of DNA binding 1, dominant neg
serine (or cysteine) proteinase inhibito
chemokine (C-X-C motif), receptor 4 (fus
G protein-coupled receptor kinase 5
phosphodiesterase 4B, cAMP-specific (dun 
protein C receptor, endothelial (EPCR) 
calcitonin receptor-like 
gb:Human alpha satellite and satellite 3 
selectin E (endothelial adhesion molecul 
pentaxin-related gene, rapidly induced b 
serine (or cysteine) proteinase inhibito 
intercellular adhesion molecule 2 
laminin, beta 1

phospholipase A2, group IVA (cytosolic, 
transcription factor 4 
connective tissue growth factor 
fatty acid binding protein 5 (psoriasis- 
singed (Drosophila)-like (sea urchin fas 

damage-specific DNA binding protein 2 (4 
multimerin 

guanine nucleotide binding protein 11 

protein kinase C-like 1 

MAD (mothers against decapentaplegic, Dr 

karyopherin (importin) beta 2

A kinase (PRKA) anchor protein (gravin) 

syndecan binding protein (syntenin) 

lysyl oxidase-like 2 

serine (or cysteine) proteinase inhibito
integrin, alpha 5 (fibronectin receptor, 
matrix metalloproteinase 10 (stromelysin 
matrix metalloproteinase 1 (interstitial 
placental growth factor, vascular endoth 
tyrosine kinase with immunoglobulin and 
hematopoietically expressed homeobox 
peroxiredoxin 1 

transmembrane protein (63kD), endoplasmi
cadherin 5, type 2, VE-cadherin (vascula 
caveolin 1, caveolae protein, 22kD 
hypothetical protein MGC10895 
ESTs 



H46612 Hs.293815 
AA478756 Hs.194477 
AK001972 Hs.30822 
AB033888 Hs.8619 
X64116 Hs.171844 
H93366 Hs.7567 



KIAA0955 protein 

hypothetical protein DKFZp434G171 
B-cell CLL/lymphoma 6, member B (zinc fi
phospholipase A2, group IVC (cytosolic,
desmoplakin (DPI, DPII)
bromodomain-containing 4 
AD036 protein 

solute carrier family 7, (cationic amino 
ESTs 

hypothetical protein FLJ10849

Homo sapiens HSPC285 mRNA, partial cds 

E3 ubiquitin ligase SMURF2 

hypothetical protein FLJ11110 

SRY (sex determining region Y)-box 18

Homo sapiens cDNA: FLJ22296 fis, clone H 

Homo sapiens cDNA: FLJ21962 fis, clone H 
106155
106302 
106423 
106793 
107174 
107216 
107295 
107385 
108756 



109456
109768 
110107 
110906 
110984 
111006 
111018 
111133 
111760 
113073 
113195 



AA425309 

AA435896 

AA448238 

AA478778 

AA621714 

D51069 

T34527 

U97519 

AA127221 

AA132983

AA135606

AA155125

AA179845



115145 
115819 
115947 
116314 



T33637 
T57112 
W80763 
AA046808 
AA253217 
AA255991 
AA258138 
AA426573 
AA443793 
AA490588 
9 AA496257 



116733 F 
117023 h 
117186 y 
117563 l> 
117997 f 
118475 l> 
118581 l> 
119073 
119155 
119174 
119221 
119416 
119866 
121335 
121381 
123160 
123473 
123523 
123533 



R61715 
R71234 
R98105 
T97186 
W80814 
AA404418 
AA405747 



AA608588 
AA608751 
C13961



3 N93521 

3 N95477 

r R60044 

5 R70506 

1 T91518 

3 T95333 



126563 W26247 



126872 AA136653 

456000 AA136653 

414221 AA136653

127402 AA358869 



AA425414 Hs.33287 

AA398859 Hs.18397

AB020722 Hs.16714

H94997 Hs.16450

BE122762 Hs.25338

D51069 Hs.211579

AA186629 Hs.80120
NM_005397 Hs.16426

AA127221 Hs.117037

AL117452 Hs.44155

AA135606 Hs.189384 

AI056548 Hs.72116 

AA219691 Hs.73625 



F06838 Hs.14763
AW151660 Hs.31444
AA035211 Hs.17404
AW613287 Hs.80120 
BE387014 Hs.166146 
AI287912 Hs.3628 
AW580939 Hs.97199 



N39342 Hs.103042 
H83265 Hs.8881 
AW953484 Hs.3849 
AW139036 Hs.108957
AI751438 Hs.41271 
AI683069 Hs.175319 
AA740907 Hs.88297 
AA486620 Hs.41135 
R47479 Hs.94761 
AI799104 Hs.178705 
AK000290 Hs.44033 
AK001531 Hs.66048 
AI557212 Hs.17132 
AL157424 Hs.61289 
AW070211 Hs.102415 
H98988 Hs.42612 
AF055634 Hs.44553 
N52090 Hs.47420 
N66845 
N68905 

BE245360 Hs.279477 E 
R61715 Hs.310598 E 
R71234 
C14322 Hs.250700 tr 
T97186 

AA496205 Hs.193700
AA404418 

AW088642 Hs.97984 
AA488687 Hs.284235 



nuclear factor I/B

hypothetical protein FLJ23221

Rho guanine exchange factor (GEF) 15



melanoma cell adhesion molecule 

UDP-N-acetyl-alpha-D-galactosamine:polyp

podocalyxin-like 

ESTs 

DKFZP586G1517 protein 

gb:zl10a05.s1 Soares_pregnant_uterus_NbH 

hypothetical protein FLJ20992 similar to 

RAB5 interacting, kinesin-like (rabkines 

ESTs 

ESTs 

ESTs 

ESTs 

UDP-N-acetyl-alpha-D-galactosamine:polyp 
Homer, neuronal immediate early gene, 3 
mitogen-activated protein kinase kinase
complement component C1q receptor
Homo sapiens cDNA FLJ11949 fis, clone HE
microtubule-associated protein 1B 
ESTs, Weakly similar to S41044 chromosom 
hypothetical protein FLJ22041 similar to 
40S ribosomal protein S27 isoform 
Homo sapiens mRNA full length insert cDN 
ESTs 
ESTs 

endomucin-2 
KIAA1691 protein 

Homo sapiens cDNA FLJ11333 fis, clone PL
dipeptidyl peptidase 8
hypothetical protein FLJ10669
ESTs, Moderately similar to I54374 gene
synaptojanin 2 

Homo sapiens mRNA; cDNA DKFZp586N0121 (f 
ESTs, Weakly similar to ALU1_HUMAN ALU S 



EST 



AA608751 

AI147155 Hs.270016 
NM_005402 Hs.288757

AI680737 Hs.289068

AI571594 Hs.102943

W07701 Hs.304177

AI887664 Hs.285814
T91518

AA570056 Hs.122730

R60547 Hs.170098
R20840 

R23858 Hs.143375 

R23858 Hs.143375 

T92143 Hs.57958 

BE247550 Hs.86859 

AA516391 Hs.181358 

AA001860 Hs.279531 

AA001860 Hs.279531 



BE180876 Hs.11614 
AW450979 

AA358869 Hs.227949 



gb:ye50h09.s1 Soares fetal livi 
Homo sapiens mRNA; cDNA DKFZp586I0324 (f
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_ 
hypothetical protein FLJ22252 similar to 
ESTs, Weakly similar to I38022 hypotheti 
gb:ae52d04.s1 Stratagene lung carcinoma
gb:ae54e06.s1 Stratagene lung carcinoma 
gb:ae56h07.s1 Stratagene lung carcinoma 
gb:C13961 Clontech human aorta polyA+mR 
ESTs 

v-ral simian leukemia viral oncogene hom

Homo sapiens cDNA FLJ11918 fis, clone HE

hypothetical protein MGC12916 

Homo sapiens clone FLB8503 PRO2286 mRNA,

sprouty (Drosophila) homolog 4 

gb:ye20f05.s1 Stratagene lung (937210) H 

ESTs, Moderately similar to KIAA1215 pro 

KIAA0372 gene product 

gb:yg05c08.r1 Soares infant brain 1NIB H 

Homo sapiens, clone IMAGE:3840937, mRNA,

Homo sapiens, clone IMAGE:3840937, mRNA,

EGF-TM7-latrophilin-related protein 

growth factor receptor-bound protein 7 

U5 snRNP-specific protein (220 kD), orth 

ESTs 

ESTs 

gb:UI-H-BI3-ala-a-12-0-UI.s1 NCI_CGAP_Su
HSPC065 protein

gb:UI-H-BI3-ala-a-12-0-UI.s1 NCI_CGAP_Su
SEC13 (S. cerevisiae)-like 1
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127651 


AI123976 


AA382523 Hs.1 05689 


MSTP031 protein 


424806 


AI123976 


AA382523 Hs.1 05689 


MSTP031 protein 


128062 


AA37950D 


AA379621 Hs.1 05547 


neural proliferation, differentiation an 


128992 


R49693 


H04150 Hs.107708 


ESTs 


129046 


AA195678 


AB029290 Hs.1 08258 


actin binding protein; macrophin (microf 


129188 


M30257 


NM_001078Hs.109225 


vascular cell adhesion molecule 1 


129314 


AA028131 


BE622768 Hs.290356 


mesoderm development candidate 1 


129371 


M10321 


X06828 Hs.1 10802 


von Willebrand factor 


129468 


J03040 


AW410538 Hs.111779 


secreted protein, acidic, cysteine-rich 


129765 


M86933 


M86933 Hs.1238 


amelogenin (Y chromosome) 


129805 


AA012933 


AA012848 Hs.12570 


tubulin-specific chaperone d 


129884 


AA286710 


AF055581 Hs.13131 


lysosomal 


130495 


AA243278 


AW250380 Hs.109059 


mitochondrial ribosomal protein L12 


130639 


D59711 


AI557212 Hs.17132 


ESTs, Moderately similarto I54374 gene 


130657 


T94452 


AW337575 Hs.201591 


ESTs 


130828 


AA053400 


AW631469 Hs.203213 


ESTs 


130972 


AA370302 


D81866 Hs.21739 


Homo sapiens mRNA; cDNA DKFZp586l1518 (f 


131080 


J05008 


NM_001955Hs.2271 


endothelin 1 


131137 


U85193 


W27392 Hs.33287 


nuclear factor l/B 


131182 


AA256153 


AI824144 Hs.23912 


ESTs 


131486 


X83107 


F06972 Hs.27372 


BMX non-receptor tyrosine kinase 


131573 


AA046593 


AA040311 Hs.28959 


ESTs 


131647 


AA410480 


AA359615 Hs.30089 


ESTs 


131756 D45304 


AA443966 Hs.31595 


ESTs 


131859 


M90657 


AW960564 


transmembrane 4 superfamily member 1 


131881 


AA010163 


AW361018 Hs.3383 


upstream regulatory element binding prot 


132050 


AA136353 


AI267615 Hs.38022 


ESTs 


132083 


Y07867 


BE386490 Hs.279663 


Pirin 


132164 


U84573 


AI752235 Hs.41270 


procollagen-lysine, 2-oxoglutarate 5-dio 


132358 


X60486 


NM 003542HS.46423 


H4 histone family, member G 


132413 


AA132969 


AW361383 Hs.260116 


metalloprotease 1 (pitrilysin family) 


132456 


AA1 14250 


AB011084 Hs.48924 


KIAA0512gene product; ALEX2 


132490 


F13782 


NM_001290Hs.4980 


LIM domain binding 2 


132676 


M283035 


N92589 Hs.261038 


ESTs, Weakly similar to I38022 hypotheti 


132687 


AB002301 


AB002301 Hs.54985 


KIAA0303 protein 


132718 


AA056731 


NM_004600 Hs.554


Sjogren syndrome antigen A2 (60kD, ribon 


132736 


U68019 


AW081883 Hs.211578 


Homo sapiens cDNA: FLJ23037 fis, clone L


132760 


H99198 


AA125985 Hs.56145 


thymosin, beta, identified in neuroblast 


132933 


AA598702 


BE263252 Hs.6101 


hypothetical protein MGC3178 


132968 


N77151 


AF234532 Hs.31638 


myosin X 


132994 


AA505133 


AA112748 Hs.279905


clone HQ0310 PRO0310p1


133061 


AB000584 


AI186431 Hs.296638 


prostate differentiation factor 


133147 


D12763 


AA026533 Hs.66 


interleukin 1 receptor-like 1 


133161 


AA253193 


AW021103 Hs.6631 


hypothetical protein FLJ20373 


133200 AA432248 


AB037715 Hs.183639


hypothetical protein FLJ10210 


133260 


AA083572 


AA403045 Hs.6906 


Homo sapiens cDNA: FLJ23197 fis, clone R 


133363 


AA479713 


AI866286 Hs.71962 


ESTs, Weakly similar to B36298 proline-r


133491 


L40395 


BE619053 Hs.170001


eukaryotic translation initiation factor


133517 X52947 


NM_000165 Hs.74471


gap junction protein, alpha 1, 43kD (con 


133550 


W80846 


AI129903 Hs.74669 


vesicle-associated membrane protein 5 (m 


133607 


M34539 


BE273749 


FK506-binding protein 1A (12kD) 


133614 


D67029 


NM_003003 Hs.75232


SEC14(S. cerevisiae)-like 1 


133627 


U09587 


NM_002047 Hs.75280


glycyl-tRNA synthetase 


133691 


M85289 


M85289 Hs.211573 


heparan sulfate proteoglycan 2 (perlecan


133696 


D10522 


AI878921 Hs.75607 


myristoylated alanine-rich protein kinas 


133913 


W84712 


AU076964 Hs.7753 


calumenin 


133975 


D29992 


C18356 Hs.295944 


tissue factor pathway inhibitor 2 


133985 


L34657 


L34657 Hs.78146 


platelet/endothelial cell adhesion molec


134039 


S78569 


NM_002290 Hs.78672


laminin, alpha 4 


134088 


D43636 


AI379954 Hs.79025 


KIAA0096 protein 


134161 


U97188 


AA634543 Hs.79440 


IGF-II mRNA-binding protein 3 


134299 


AA487558 


AW580939 Hs.97199 


complement component C1q receptor 


134416 


M28882 


X68264 Hs.211579 


melanoma cell adhesion molecule 


116470 


X70683 


AI272141 Hs.83484 


SRY (sex determining region Y)-box 4 


134656 


X14787 


AI750878 Hs.87409


thrombospondin 1 


134989 


AA236324 


AW968058 Hs.92381 


nudix (nucleoside diphosphate linked moi 


135051 


C15324 


AI272141 Hs.83484 


SRY (sex determining region Y)-box 4 


135073 


AA452000 


W55956 Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


135349 


D83174 


AA114212 Hs.9930


serine (or cysteine) proteinase inhibito 


100114 


D00596 


X02308 Hs.82962 


thymidylate synthetase 


100130 


D11428 


NM_000304 Hs.103724


peripheral myelin protein 22


100143 


D13640 


AU076465 Hs.278441 


KIAA0015 gene product 


100168 


D14874 


H73444 Hs.394 


adrenomedullin 


100208 


D26129 


NM_002933 Hs.78224


ribonuclease, RNase A family, 1 (pancrea 


100224 D28476 


AL121516 Hs.138617 


thyroid hormone receptor interactor 12 


100405 D86425 


AW291587 Hs.82733 


nidogen 2 
100420
100455 
100529 
100618 



101097 
101110 

101142
101156 
101168 
101184 
101192 

101317
101336 
101345 
101400 
101475 

101485
101496 
101505 
101543 
101557 

101560
101587 
101592 
101633 
101634 

101667
101682 
101714 
101720 
101741 

101744
101793 
101837 
101838 
101840 

101857
101864 



D87953 

HG1862-HT1897 
HG2614-HT2710 
HG2639-HT2735 
HG2855-HT2995 
HG3044-HT3742 
HG3342-HT3519 
HG3543-HT3739 
HG4069-HT4339 
HG417-HT417 
J03764 



L12711 
L13977 
L15388 
L19871 



L76380 
M15990 
M23254 
M24736 
M26576 
M27396 
M31166 
M31994 
M32334 
M35878 
M36429 
M57730 
M57731 
M60858 
M62994 
M68874 
M69043 
M74719 
M75126 
M84349 



D86983 Hs.118893 Melanoma associated gene



102013
102024 
102059 
102121 
102283 

102300
102378 



M93056 
M94856 
M95787 
S76965 
S81914 
U03057 
U03100 
U03877 
U08021 
U14391 
U31384 
U32944 
U40369 
5 U41767 
3 U48959 

1 U51010 
9 U51478 

3 U53445 
0 U59289 

4 U59423 
9 U62015 
0 U63825 

5 U67963 
7 U73379 

3 U73824 
9 U77604 
9 U81607 

4 U89942 

2 X04412 
7 X06985 

5 X07820 



102960 X15729 



AW888941 Hs.75789 
BE313693 Hs.334330 
AI752163 Hs.114599
N24433 Hs.241567 
U56725 Hs.180414 
X02761 Hs.287820 
BE295928 Hs.75424 
T81309

AL048753 Hs.303649 
AA836472 Hs.297939 
J03836 Hs.82085 
BE245301 Hs.89414 
AI439011 Hs.86386 
L12711 Hs.89643 
AA340987 Hs.75693 
NM_005308 Hs.211569
NM_001674 Hs.460
BE247295 Hs.78452 
L42176 Hs.8302 
NM_006732 Hs.75678
NM_005795 Hs.152175
M15990 Hs.194148 
BE410405 Hs.76288 
AA296520 Hs.89546 
X12784 Hs.119129
AA307680 Hs.75692 
M31166 Hs.2050 
BE293116 Hs.76392 
AW958272 Hs.347326 
AI752416 Hs.77326 
AF064853 Hs.91299 
NM_004428 Hs.1624
AV650262 Hs.75765 
NM_005381 
AF043045 Hs.81008 
M68874 Hs.211587 
M69043 Hs.81328 
NM_003199 Hs.326198
AI879352 Hs.118625
W01076 Hs.278573 
M92843 Hs.343586 
BE243845 Hs.75511 
AA236291 Hs.183583 
BE550723 Hs.153179 
BE392588 Hs.75777 
NM_006823 Hs.75209
X96438 Hs.76095 
BE259035 Hs.118400
BE616287 Hs.178452 
AA301867 Hs.76224 
AI752666 Hs.76669 
NM_004998 Hs.82251
AW161552 Hs.83381 
AI929721 Hs.5120 
AU076887 Hs.28491 
AU077005 Hs.92208 
U48959 Hs.211582 
U51010 

BE243877 Hs.76941 
U53445 Hs.15432 
R97457 Hs.63984 
U59423 Hs.79067 
AU076728 Hs.8867 
AI984144 Hs.66713 
AL119566 Hs.6721
NM_007019 Hs.93002
AA532780 Hs.183684 
AA122237 Hs.81874 
NM_005100 Hs.788
NM_002318 Hs.83354
AI767736 Hs.290070 



N-myc downstream regulated 
calmodulin 2 (phosphorylase kinase, delt 
collagen, type VIII, alpha 1 
RNA binding motif, single stranded inter 
heat shock 70kD protein 2 
fibronectin 1 

inhibitor of DNA binding 1, dominant neg 
insulin-like growth factor 2 (somatomedi 
small inducible cytokine A2 (monocyte ch 
cathepsin B 

serine (or cysteine) proteinase inhibito 

chemokine (C-X-C motif), receptor 4 (fus 

myeloid cell leukemia sequence 1 (BCL2-r 

transketolase (Wernicke-Korsakoff syndro 

prolylcarboxypeptidase (angiotensinase C

G protein-coupled receptor kinase 5 

activating transcription factor 3 

solute carrier family 20 (phosphate tran 

four and a half LIM domains 2 

FBJ murine osteosarcoma viral oncogene h 

calcitonin receptor-like 

v-yes-1 Yamaguchi sarcoma viral oncogene 

calpain 2, (m/II) large subunit

selectin E (endothelial adhesion molecul

collagen, type IV, alpha 1 



pentaxin-related gene, rapidly induced b 
aldehyde dehydrogenase 1 family, member 
intercellular adhesion molecule 2 
insulin-like growth factor binding prote 
guanine nucleotide binding protein (G pr 
ephrin-A1 
GRO2 oncogene
nucleolin

filamin B, beta (actin-binding protein-2
phospholipase A2, group IVA (cytosolic, 
nuclear factor of kappa light polypeptid 
transcription factor 4 



CD59 antigen p18-20 (antigen identified 
zinc finger protein homologous to Zfp-36 
connective tissue growth factor 
serine (or cysteine) proteinase inhibito 
fatty acid binding protein 5 (psoriasis- 
transgelin 

protein kinase (cAMP-dependent, catalyti
immediate early response 3 
singed (Drosophila)-like (sea urchin fas 
catenin (cadherin-associated protein), a 
EGF-containing fibulin-like extracellula
nicotinamide N-methyltransferase 
myosin IE 

guanine nucleotide binding protein 11
dynein, cytoplasmic, light polypeptide
spermidine/spermine N1-acetyltransferase
a disintegrin and metalloproteinase doma
myosin, light polypeptide kinase 
gb:Human nicotinamide N-methyltr 
ATPase, Na+/K+ transporting, beta 3 poly 
downregulated in ovarian cancer 1 
cadherin 13, H-cadherin (heart) 
MAD (mothers against decapentaplegic, Dr 
cysteine-rich, angiogenic inducer, 61 
hepatitis delta antigen-interacting prot
lysosomal 

ubiquitin carrier protein E2-C 
eukaryotic translation initiation factor 
microsomal glutathione S-transferase 2 
A kinase (PRKA) anchor protein (gravin) 



X07820 Hs.2258 
BE512730 Hs.65114 
AI904738 Hs.76053 



gelsolin (amyloidosis, Finnish type) 
heme oxygenase (decycling) 1 
matrix metalloproteinase 10 (stromelysin 
keratin 18 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 
103011 X52541

103020 X53416 

103029 X54489 

103036 X54925 

103056 X57206

103080 X59798 

103095 X60957 

103138 X65965 

103176 X69111 

103195 X70940

103347 X87838 

103371 X91247 

103432 X97748 

103471 Y00815 

103967 AA303711

104447 L44538 

104764 AA025351 

104783 AA027050 



104877 AA047437
104894 AA054087 
104952 AA071089 
105113 AA156450 

105178 AA187490
105196 AA195031 
105215 AA205724 
105263 AA227926 
105271 AA227986

105330 AA234743
105461 AA253216 

105492 AA256210 

105493 AA256268 
105594 AA279397 

105727 AA292379
105732 AA292717 
105767 AA346551 
105882 AA400292 
105936 AA404338 

106031 AA412284
105124 AA423987 
106222 AA428594 
106241 AA430108 
106263 AA431462 

106264 AA431470
105366 AA443756 
106454 AA449479 
106634 AA459916 
106724 AA465226 

105793 AA478778
106799 AA479037 
105842 AA482597 
106868 AA487561 
106890 AA489245 

106961 AA504110
106974 AA520989 
107030 AA599434 
107061 AA608649 
107086 AA609519 

107216 D51069
107385 U97519 
107444 W28391 
107985 AA035638 
108507 AA083514 

108595 AA121315
108931 AA147186 
109001 AA156125 
109195 AA188932 
109390 AA219653 

109456 AA232645
109737 F10078 
110411 H48032 
110660 H82117 
110906 N39584 

111018 N54067
111091 N59858 



AJ243425 Hs.326035 early growth response 1 
X53416 Hs.195464 filamin A, alpha (actin-binding protein- 
AW800726 Hs.789 GRO1 oncogene (melanoma growth stimulati
M13509 Hs.83169 matrix metalloproteinase 1 (interstitial
Y18024 Hs.78877 inositol 1,4,5-trisphosphate 3-kinase B
AU077231 Hs.82932 cyclin D1 (PRAD1; parathyroid adenomatos
NM_005424 Hs.78824 tyrosine kinase with immunoglobulin and
X65965 gb:H.sapiens SOD-2 gene for manganese su

AL021154 Hs.76884 inhibitor of DNA binding 3, dominant neg
AA351647 Hs.2642 eukaryotic translation elongation factor 
AU077309 Hs.171271 catenin (cadherin-associated protein), b 
X91247 Hs.13046 thioredoxin reductase 1 
X97748 gb:H.sapiens PTX3 gene promotor region, 

Y00815 Hs.75216 protein tyrosine phosphatase, receptor t 
AL120051 Hs.144700 ephrin-B1 
AW204145 Hs.156044 ESTs 
AI039243 Hs.278585 ESTs 

AA533513 Hs.93659 protein disulfide isomerase related prot 

AW952619 Hs.17235 Homo sapiens clone TCCCIA00176 mRNA sequ

T79340 Hs.22575 B-cell CLL/lymphoma 6, member B (zinc fi

AI138635 Hs.22968 Homo sapiens clone IMAGE:451939, mRNA se

AF065214 Hs.18858 phospholipase A2, group IVC (cytosolic,

AW076098 Hs.345588 desmoplakin (DPI, DPII)

AB037816 Hs.8982 Homo sapiens, clone IMAGE:3506202, mRNA, 

AA313825 Hs.21941 AD036 protein 

W84893 Hs.9305 angiotensin receptor-like 1 

AA205759 Hs.10119 hypothetical protein FLJ14957

AW388633 Hs.6682 solute carrier family 7, (cationic amino

AA807881 Hs.25329 ESTs 

AW338625 Hs.22120 ESTs 

BE539071 Hs.69388 hypothetical protein FLJ20505 

AI805717 Hs.289112 CGI-43 protein

AL047586 Hs.10283 RNA binding motif protein 8B 



AL135159 Hs.20340 KIAA1002 protein

AW504170 Hs.274344 hypothetical protein MGC12942 

AW370946 Hs.23457 ESTs 

W46802 Hs.81988 disabled (Drosophila) homolog 2 (mitogen 

AI678765 Hs.21812 ESTs 

X64116 Hs.171844 Homo sapiens cDNA: FLJ22296 fis, clone H 

HS3365 Hs.7567 Homo sapiens cDNA: FLJ21962 fis, clone H 

AA356392 Hs.21321 Homo sapiens clone FLB9213 PRO2474 mRNA,

BE019681 Hs.6019 Homo sapiens cDNA: FLJ21288 fis, clone C 

W21493 Hs.28329 hypothetical protein FLJ14005

AL046859 Hs.3407 protein kinase (cAMP-dependent, catalyti 

AA186715 Hs.336429 RIKEN cDNA 9130422N19 gene 

NM_014038 Hs.5216 HSPC028 protein

W25491 Hs.288909 hypothetical protein FLJ22471

N48670 Hs.28631 Homo sapiens cDNA: FLJ22141 fis, clone H 

H94997 Hs.16450 ESTs

BE313412 Hs.7961 Homo sapiens clone 25012 mRNA sequence 

AF124251 Hs.26054 novel SH2-containing protein 3 

BE185536 Hs.301183 molecule possessing ankyrin repeats indu 

AA489245 Hs.88500 mitogen-activated protein kinase 8 inter 

AW243614 Hs.18063 Homo sapiens cDNA FLJ10768 fis, clone NT 

AI817130 Hs.9195 Homo sapiens cDNA FLJ13698 fis, clone PL

AL117424 Hs.25035 chloride intracellular channel 4

BE147611 Hs.6354 stromal cell derived factor receptor 1 

NM_012331 Hs.26458 methionine sulfoxide reductase A

D51069 Hs.211579 melanoma cell adhesion molecule 

NM_005397 Hs.16426 podocalyxin-like

W28391 Hs.343258 proliferation-associated 2G4, 38kD 

T40054 Hs.71968 Homo sapiens mRNA; cDNA DKFZp564F053 (fr 

AI554545 Hs.68301 ESTs 

AB029000 Hs.70823 KIAA1077 protein 

AA147186 gb:zo38ti01.s1 Stratagene endothelial cel

AI056548 Hs.72116 hypothetical protein FLJ20992 similar to

AF047033 Hs.132904 solute carrier family 4, sodium bicarbon

AW007485 Hs.87125 EH-domain containing 3

AW956580 Hs.42699 ESTs 

AA055415 Hs.13233 ESTs, Moderately similar to A47582 B-cel 

AW001579 Hs.9645 Homo sapiens mRNA for KIAA1741 protein, 

AA782114 Hs.28043 ESTs 

AA035211 Hs.17404 ESTs 

AI287912 Hs.3628 mitogen-activated protein kinase kinase 

AA300067 Hs.33032 hypothetical protein DKFZp434N185 
111356 N90933
111378 N93764 
111741 R26124 
111769 R27957 
112318 R55470 
112951 T16550 
113057 T26674 
113195 T57112 
113490 T88700 
113542 T90527 
113803 W42789 
113847 W60002 
113910 W78175 
113947 W84768 
114047 W94427 
115061 AA253217 
115819 AA426573 
115870 AA432374 
115964 AA446622 
116228 AA478771 
116264 AA482594 
116314 AA490588 
116589 D59570 
117023 H88157 
117112 H94648 
117156 H97538 
117176 H98670 
117280 N22107 
119559 W38197 
119866 W80814 
120655 AA287347 
121314 AA402799 
121335 AA404418
121822 AA425107 
121835 AA425435 
122331 AA442872 
122577 AA452860 
123160 AA488687 
123486 AA599674 
124059 F13673 
124339 H99093 
124358 N22495 
124364 N23031
124726 R15740 
124763 R39610 
125167 W45560 
125304 Z39833 
125307 Z40583 
125329 AA825437 
107985 R66613 
125598 R66613 
125609 AA368063 
116024 AA128075 
418000 AA128075 
126399 AA128075
127435 N66570 
127566 AI051390 
127619 AA827122 
434190 AA627122 
128453 X02761 
128495 AF010193 
128515 AA149044 
128580 U82108 
128623 D78676 
128642 L35240 
128669 AA598737 
128903 R69417 
128914 AA232837
129087 N72695 
129188 M30257 
129226 M96843 
129265 X68277 
129345 AA292440 
129468 J03040 
129488 AA228107 
101838 AA449789 



BE301871 Hs.4867 
AW160993 Hs.326292
AB020653 Hs.24024 
AW629414 Hs.24230 
AW083384 Hs.11C67 
AA307634 Hs.6650 
AW194301 Hs.339283 
H83265 Hs.8881 
BE178110 Hs.173374 
H43374 Hs.7890 



NM_005032 Hs.4114
AA113262 Hs.17901
W84768 

AL035B58 Hs.3807 
AI751438 Hs.41271 
AA486620 Hs.41135 
NM_005985 Hs.48029
AA987568 Hs.74313 
AI767947 Hs.50841 
D51174 Hs.272239 
AI799104 Hs.178705 
AI557212 Hs.17132 
AW070211 Hs.102415 



W73853 

H45100 Hs.49753 

M18217 Hs.172129 
W38197 

AA496205 Hs.193700 

AA305599 Hs.238205 

W07343 Hs.182538 
AA404418 
AI743860 

AB033030 Hs.300670 

AL133437 Hs.110771

AA829725 Hs.334437 

AA488687 Hs.284235

BE019072 Hs.334802 

BE387335 Hs.283713 

H99093 Hs.343411 

AW070211 Hs.102415 

AF265555 Hs.250646 
NM_003654 Hs.104576

BE410405 Hs.76288 

AL137540 Hs.102541 

AL359573 Hs.124940 

AW580945 Hs.330466 

AA825437 Hs.58875 

T40064 Hs.71968 

T40064 Hs.71968 

AA868063 Hs.104576 

AA088767 Hs.83883 

AA932794 Hs.83147 

AA088767 Hs.83883 

X69086 Hs.286161 

AI051390 Hs.116731 

AA627122 Hs.163787 

AA627122 Hs.163787 

X02761 Hs.287820 
NM_005904 Hs.100602



Hs.1C 

U82108 Hs.101813 
BE076608 Hs.105509 
Z28913 Hs.102948 
W28493 Hs.180414 
AW150717 Hs.345728 
AW867491 Hs.107125 
AI348027 Hs.108557 
NM_001078 Hs.109225
BE222494 Hs.180919 
AA530892 Hs.171695 
R22497 Hs.110571
AW410538 Hs.111779
AW965728 Hs.54642 
5 Hs.75511 



mannosyl (alpha-1,3-)-glycoprotein beta-
hypothetical gene DKFZp434A1114 
KIAA0846 protein 
ESTs 

ESTs, Highly similar to T46395 hypotheti 
vacuolar protein sorting 45B (yeast homo 
Human DNA sequence from clone RP1-187J11
ESTs, Weakly similar to S41044 chromosom
Homo sapiens cDNA FLJ10500 fis, clone NT
Homo sapiens mRNA for KIAA1671 protein, 
chromosome 8 open reading frame 4 
plastin 3 (T isoform) 

Homo sapiens, clone IMAGE:3937015, mRNA, 
gb:zh53d03.s1 Soares_fetal_liver_spleen_ 
FXYD domain-containing ion transport reg 
Homo sapiens mRNA full length insert cDN 
endomucin-2 

snail 1 (drosophila homolog), zinc finge 

KIAA1265 protein 

ESTs 

lysosomal 

Homo sapiens cDNA FLJ11333 fis, clone PL
ESTs, Moderately similar to I54374 gene
Homo sapiens mRNA; cDNA DKFZp586N0121 (f 
ESTs 
ESTs 

uveal autoantigen with coiled coil domai 
Homo sapiens cDNA: FLJ21409 fis, clone C
Empirically selected from AFFX single pr 
Homo sapiens mRNA; cDNA DKFZp586I0324 (f
hypothetical protein PRO2013 
phospholipid scramblase 4 
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_ 
metallothionein 1E (functional) 
KIAA1204 protein

Homo sapiens cDNA: FLJ21904 fis, clone H 
hypothetical protein MGC4248 
ESTs, Weakly similar to I38022 hypotheti 
Homo sapiens cDNA FLJ14680 fis, clone NT 
ESTs, Weakly similar to S64054 hypotheti 
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep 
Homo sapiens mRNA; cDNA DKFZp586N0121 (f 
baculoviral IAP repeat-containing 6 
carbohydrate (keratan sulfate Gal-6) sul 
calpain 2, (m/II) large subunit
netrin 4 

GTP-binding protein 

ESTs 

ESTs 

Homo sapiens mRNA; cDNA DKFZp564F053 (fr 
Homo sapiens mRNA; cDNA DKFZp564F053 (fr 
carbohydrate (keratan sulfate Gal-6) sul 
transmembrane, prostate androgen induced 
guanine nucleotide binding protein-like 
transmembrane, prostate androgen induced 
Homo sapiens cDNA FLJ13613 fis, clone PL
ESTs 
ESTs 
ESTs 

fibronectin 1

MAD (mothers against decapentaplegic, Dr 
type I transmembrane protein Fn14 
solute carrier family 9 (sodium/hydrogen 
CTL2 gene 

enigma (LIM domain protein) 
heat shock 70kD protein 8 
STAT induced STAT inhibitor 3
plasmalemma vesicle associated protein 
hypothetical protein PP1057 



lie 1 

inhibitor of DNA binding 2, dominant neg 
dual specificity phosphatase 1 
growth arrest and DNA-damage-inducible, 
secreted protein, acidic, cysteine-rich
methionine adenosyltransferase II, beta
connective tissue growth factor 
413731 AA449789
129557 W01367 
129619 AA610116 
129627 AA258308 
129762 AA460273 
129884 AA286710 
130018 T68873 
130147 D63476 
130178 M62403 
130282 X55740 
130431 L10284 
130495 AA243278 
130553 AA430032 

130638 H16402 

130639 D59711 
130657 T94452 
130686 AA431571
130776 R79356 
130818 AA280375 
130840 Z49269 
130899 Z41740 
131002 AA121543 
131080 J05008 
131084 AA101878 
131091 T35341 
131107 N87590 
131182 AA256153 
131207 W74533 
131319 U25997 
131328 V01512 
131509 X56681 
131555 AA161292 
131564 AA491465 
131573 AA046593 
131692 D50914 
131756 D45304 
131859 M90657 
131909 W69127 
131915 AA316186 
132046 AA384503 
132050 AA136353 
132151 AA044755 
132164 U84573 
132187 AA058911 
132303 AA620962
132314 AA285290 
132358 X60486 
132398 R31641 
132421 AA489190 
132490 F13782 
132520 AA257993 
132546 M24283 
132510 AA443114 
132716 T35289 
132840 N23817 
132883 AA047151 
132968 N77151 
132989 AA480074 
132999 Y00787 
133071 T99789 
133076 W84341 
133099 L09209 
133147 D12763 
133149 T16484 
133161 AA253193 
133200 AA432248 
133220 X82200 
133260 AA083572 
133295 L00352 
133349 N75791 
133391 X57579 
133398 X02612 
133436 H44631 
133454 AA090257 
133478 X83703 
133491 L40395 



BE243845 Hs.75511 

AL045404 Hs.46366 

AA209534 Hs.284243 

T40064 Hs.71968 

AA453694 Hs.12372

AF055581 Hs.13131 
AA353093 

D63476 Hs.172813 

U20982 Hs.1516 



AW505214 Hs.155560 ci 
AW250380 Hs.109059 
AF062649 Hs.252587 
AW021276 Hs.17121 
AI557212 Hs.17132 
AW337575 Hs.201591 
BE548267 Hs.337986 
AF167706 Hs.19280 
AW190920 Hs.19928 
BE048821 Hs.20144 
AI077288 Hs.296323 
AL050295 Hs.22039 
NM_001955 Hs.2271
NM_017413 Hs.303084
AJ271216 Hs.22880 
BE620886 Hs.75354 
AI824144 Hs.23912 
AF104266 Hs.24212 
NM_003155 Hs.25590
AW939251 Hs.25647 
X56681 Hs.2780 
T47364 Hs.278613 
T93500 Hs.28792 
AA040311 Hs.28959 
BE559681 Hs.30736 
AA443966 Hs.31595 



connective tissue growth factor 
KIAA0948 protein 
tetraspan NET-6 protein 

Homo sapiens mRNA; cDNA DKFZp564F053 (fr 
tripartite motif protein TRIM2 
lysosomal 
metallothionein 1L

PAK-interacting exchange factor beta
insulin-like growth factor-binding prote
5' nucleotidase (CD73) 
calnexin 

mitochondrial ribosomal protein L12 
pituitary tumor-transforming 1 
ESTs 

ESTs, Moderately similar to I54374 gene 
ESTs 

Homo sapiens cDNA FLJ10934 fis, clone OV
cysteine-rich motor neuron 1 
hypothetical protein SP329 
small inducible cytokine subfamily A (Cy 
serum/glucocorticoid regulated kinase
KIAA0758 protein 
endothelin 1 

apelin; peptide ligand for APJ receptor 



AI267615 Hs.38022 
BE379499 Hs.173705 
AI752235 Hs.41270 
AA235709 Hs.4193 
BE177330 Hs.325093 
AF112222 Hs.323806
NM_003542 Hs.46423
AA876616 Hs.16979
AW163483 Hs.48320
NM_001290 Hs.4980
AA257992 Hs.50651 
M24283 Hs.168383 
AA160511 Hs.5326
BE379595 Hs.283738 
BE218319 Hs.5807 
AA373314 Hs.5897 
AF234532 Hs.61638 
AA480074 Hs.331328 
Y00787 Hs.624 
BE384932 Hs.64313 
AW946276 Hs.6441 
W16518 Hs.279518 
AA026533 Hs.66 
AA370045 Hs.6607 
AW021103 Hs.6631 
AB037715 Hs.183639
NM_006074 Hs.318501
AA403045 Hs.6906 
AI147861 Hs.213289 
AW631255 Hs.8110 
AW103364 Hs.727
NM_000499 Hs.72912
BE294068 Hs.737 
BE547647 Hs.177781 
X83703 Hs.31432 
I Hs.170001 



GCN1 (general control of amino-acid synt 
ESTs 
latrophilin 
stanniocalcin 1 

v-fos FBJ murine osteosarcoma viral onco 
jun D proto-oncogene
interferon, alpha-inducible protein 27 
Homo sapiens cDNA FLJ11041 fis, clone PL 
ESTs 

KIAA0124 protein 
ESTs 

transmembrane 4 superfamily member 1 

SCAN domain-containing 1 

ESTs, Highly similar to S94541 1 clone 4 

chromosome 14 open reading frame 4 

ESTs 

Homo sapiens cDNA: FLJ22050 fis, clone H 

procollagen-lysine, 2-oxoglutarate 5-dio

DKFZP58601624 protein 

Homo sapiens cDNA: FLJ21210 fis, clone C

pinin, desmosome associated protein 

H4 histone family, member G 

ESTs, Weakly similar to A43932 mucin 2 p 

double ring-finger protein, Dorfin 

LIM domain binding 2 

Janus kinase 1 (a protein tyrosine kinas 

intercellular adhesion molecule 1 (CD54) 

amino acid system N transporter 2; porcu 

casein kinase 1, alpha 1 

GTPase Rab14 

Homo sapiens mRNA; cDNA DKFZp586P1622 (f 
myosin X 

hypothetical protein FLJ13213
interleukin 8 

ESTs, Weakly similar to AF257182 1 G-pro 
Homo sapiens mRNA; cDNA DKFZp586J021 (fr 
amyloid beta (A4) precursor-like protein 
interleukin 1 receptor-like 1 
AXIN1 up-regulated 
hypothetical protein FLJ20373 
hypothetical protein FLJ10210 
Homo sapiens mRNA full length insert cDN 
Homo sapiens cDNA: FLJ23197 fis, clone R
low density lipoprotein receptor (famili 
L-3-hydroxyacyl-Coenzyme A dehydrogenase 
inhibin, beta A (activin A, activin AB a 
cytochrome P450, subfamily I (aromatic c 
immediate early protein 
hypothetical protein MGC5618 
cardiac ankyrin repeat protein 
eukaryotic translation initiation factor 
133510
133517 
133526 
133538 
133562 
133584 
133590 
133617 
133651 
133671 
133678 
133681 
133722 
133730 



133859 
133889 
133960 



AA227913 

X52947 

M11313 

L14837 

M60721 

D90209 

T67986 

AA14831B 

U97105 

T25747 

K02574 

D78577 

X53331 

S73591 

X95735 

L16862 

U44975 

M97796 

U86782 

AA099391 

M19267 



133977 L19314 

134039 S78569 

134075 U28811 

134081 L77886 

134164 C14407 

134203 M60278 

134238 R81509 

134299 AA487558 



134343 D50683 

134381 U56637 

134403 M61199 

134416 M28882 

134493 X15183 

134553 S53911 

134817 U20734 



NM_000165 Hs.74471
AU077051 Hs.74561
NM_003257 Hs.74614
M60721 Hs.74870 
D90209 Hs.181243 
T70956 Hs.75106 
BE244334 Hs.75249 
AI301740 Hs.173381 
AW503116 Hs.301819 
AW247252 
AI352558 

AW969976 Hs.279009 
BE242779 Hs.179526
BE410769 Hs.75873 
AW239400 Hs.76297 
BE616902 Hs.285313 
BE222494 Hs.180919 
U86782 Hs.178761 
U48959 Hs.211582 
M19267 Hs.77899 
C18356 Hs.295944 
AI125639 Hs.250666 
NM_002290 Hs.78672
NM_012201 Hs.78979
AL034349 Hs.79005 
AW245540 Hs.79516 
AA161219 Hs.799 
AA102179 Hs.160726 
AW580939 Hs.97199 
D86962 Hs.81875 
R70429 Hs.81988 
D50683 Hs.82028 
AI557280 Hs.184270
AA334551 

X68264 Hs.211579 
M30627 Hs.289088 
NM_001773 Hs.85289



135073 AA452000 

135170 AA282140 

135196 J02854 

135348 AA442054 



D28235 Hs.196384

AW968058 Hs.92381 

AL136653 Hs.93675

AK000967 Hs.93872 

AA876372 Hs.93961 

W27190 Hs.94 

W55956 Hs.94030 

T53169 Hs.9587 

C03577 Hs.9515 



p53-induced protein 

gap junction protein, alpha 1, 43kD (con 

alpha-2-macroglobulin 

tight junction protein 1 (zona occludens 

H2.0 (Drosophila)-like homeo box 1 

activating transcription factor 4 (tax-r 

clusterin (complement lysis inhibitor, S

ADP-ribosylation factor-like 6 interacti

dihydropyrimidinase-like 2 

zinc finger protein 146 

nucleoside phosphorylase 

tyrosine 3-monooxygenase/tryptophan 5-mo 

matrix Gla protein 

upregulated by 1,25-dihydroxyvitamin D-3 
zyxin 

G protein-coupled receptor kinase 6 

core promoter element binding protein 

inhibitor of DNA binding 2, dominant neg 

26S proteasome-associated pad1 homolog 

myosin, light polypeptide kinase 

tropomyosin 1 (alpha) 

tissue factor pathway inhibitor 2 

hairy (Drosophila)-homolog 

laminin, alpha 4 

Golgi apparatus protein 1 

protein tyrosine phosphatase, receptor t 

brain abundant, membrane attached signal 

diphtheria toxin receptor (heparin-bindi 

Homo sapiens cDNA FLJ11680 fis, clone HE

complement component C1q receptor 

growth factor receptor-bound protein 10 

disabled (Drosophila) homolog 2 (mitogen 

transforming growth factor, beta recepto 

capping protein (actin filament) muscle 

sperm specific antigen 2 

melanoma cell adhesion molecule 

heat shock 90kD protein 1, alpha 

CD34 antigen 

jun B proto-oncogene 

prostaglandin-endoperoxide synthase 2 (p 

nudix (nucleoside diphosphate linked moi 

decidual protein induced by progesterone 

KIAA1682 protein 

Homo sapiens mRNA; cDNA DKFZp667D095 (fr 
DnaJ (Hsp40) homolog, subfamily A, membe 
Homo sapiens mRNA; cDNA DKFZp586E1624 (f 
Homo sapiens cDNA: FLJ22290 fis, clone H 
myosin regulatory light chain 2, smooth 

" mma 1 (formerly subty 



WO 02/079492 



PCT/US02/04915 



TABLE 4A 

Table 4A shows the accession numbers for those pkeys lacking UnigeneIDs for Table 4. The pkeys in Table 7 lacking UnigeneIDs are represented within
Tables 1-6A. For each probeset, we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 
Accession: Genbank accession numbers 
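
The three-column layout described above (Pkey, CAT number, Accession) can be read into a lookup table with a short script. The following Python is a minimal sketch and not part of the disclosed methods. It assumes that each row begins with a six-digit Pkey followed by a CAT number of the form digits, underscore, digits, and that any following line without that prefix continues the accession list of the previous row. The names `ProbesetCluster`, `ROW_START` and `parse_table_4a` are hypothetical, and the sample rows are abbreviated excerpts of Table 4A, with the CAT number of Pkey 125565 read as 1704098_1.

```python
import re
from typing import Dict, List, NamedTuple


class ProbesetCluster(NamedTuple):
    """One Table 4A row: the gene cluster and its member accessions."""
    cat_number: str
    accessions: List[str]


# Assumed row shape: six-digit Pkey, CAT number such as 33207_21, accessions.
ROW_START = re.compile(r"^(\d{6})\s+(\d+_\d+)\s+(.*)$")


def parse_table_4a(text: str) -> Dict[str, ProbesetCluster]:
    """Map each Pkey to its CAT number and accession list."""
    clusters: Dict[str, ProbesetCluster] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines separate rows and carry no data
        match = ROW_START.match(line)
        if match:
            pkey, cat_number, rest = match.groups()
            current = ProbesetCluster(cat_number, rest.split())
            clusters[pkey] = current
        elif current is not None:
            # Continuation line: more accessions for the previous row.
            current.accessions.extend(line.split())
    return clusters


# Abbreviated excerpt of two rows, for illustration only.
SAMPLE = """\
100752 33207_21 T81309 BE019033 R94181
D58853 H78073

125565 1704098_1 R20840 R20839
"""

table = parse_table_4a(SAMPLE)
print(table["125565"].cat_number, table["125565"].accessions)
```

A row whose Pkey or CAT number was damaged in reproduction will not match the assumed row pattern and will be appended to the preceding row, so the text should be checked against the original table before such a script is relied upon.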



Pkey CAT Number Accession 

100752 33207_21 T81309 BE019033 R94181 BE019198 NM_000S12 J03242 AW411299 BE300064 BE297544 R94182 AW630108 T53723 
D58853 H78073 H80594 BE299560 T48899 H70196 M17426 N77077 S77035 H58384 H61664 H78540 T84527 C17198 
H60255 H71980 R92644 W79050 X00910 M29845 R91055 M17863 M17862 T71815 BE299561 BE464561 X06260 
R94741 T54216 C18594 BE262015 X06161 AW409889 AA378400 BE263228 BE313278 R881 16 BE313457 H43500 
T48617 BE313761 H77309 A1207601 X06159 H40413 X03425 T87663 R10627 X03562 M14118 W03982 R97520 H81229 
T83157 H83168 H48762 AA669898 BE263054 H47289 AA022807 R1 1555 H74260 R76968 R28333 H72534 H72464 
H62031 N72478 N45355 AW411300 R891 13 R69135 H58454T83281 R9347S H69645 H68015 T82229 H71089T85121 
H59939 W65299 N78176 H53909 N72373 R21788 H04660 H59639 H61874 BE262219 T53614 N73335 N50464 W00943 
N77189 R89257 AA570502 R89432 R06366 AA553480 AA776271 M551359 AA551050 H51670 AA601 052 BE299081 
H68198 H52276 BE207832 N91 192 H70332 X07868 X07868 H69464 H53782 H7371 0 R80435 M553384 AW884176 
N53475 T71662 AW954036 AW954033 AA552931 H93206 AA430218 AA55347S A1918470 T54124 BE207982 BE300177 
N73994 AW882625 N39549 N53838 AA722389 H71878 H58909 H37849 H78435 T47933 R77174 R33814 AA411890 
H94199 AA663208 BE205778 AA4901 37 H70492 RS8232 H37800 AA679294 H40341 H74238 H47290 H73231 T48618 
M025428 AI039521 H92969 N59389 H80538 H72933 T90630 AA411891 N55000 H74225 AA340290 AW957061 T54316 
AA340437 H57125 H58908 H79027 H63450 N74623 R93425 H68714 H68758 N68396 H48763 N69256 H57320 H53831 
H53589 N68833 N52453 H56048 H69870 H78074 R69253 R83375 T53615 H94330 H58455 H90864 T47934 H74261 
R89258 R97997 R91056 R28339 R86760 H78235 R97521 H87692 H40358 AA022688 H52513 H59601 T88690 H65256 
H63397 W65397 AA553588 R19280 N52645 W73930 R06367 R21743 H72372 N73921 AW883539 AW882639 T4061B 
H47084 R95723 AA634316 AA852781 H77310 R91389 H93111 R92767 T54512 R89341 H70333 H57817 H82941 H62032 
N52638 H58385 T91796 H51086 AA340292 T49918 H81230 R36121 N50411 T87664 N62436 N39340 AA665637 
AA340446 H93377 H92973 BE296290 BE269788 H616B5 AA340444 N54605 AA454101 R10628 R94200 A1200549 
M342640 BE298855 BE250229 T49916 H82008 N28278 AW880662 H71268 N7S791 H47685 H65255 W05198 
AW889144 N76677 H71702 H68036 H71915 R91612 R87807 H68059 AI133328 AI2478B6 AA621443 AW881050 
AA700847 AA340413 AW878608 AW881 181 AW878249 H71916 N54596 BE161581 AW878082 W04212 AW881040 
AW885492 AW880519 AA334887 AW87871 5 W06882 AW630222 AW885381 H70869 AW381778 H47601 AW889982 
H63868 AW884986 AW878713 AW878685 R36391 AW878694 AA368070 C03393 AW878695 AW878705 AW878B65 
AW878742 AW878620 AW878823 AW878688 R29048 AW87869D AW878686 AW878810 AW878B27 AW878733 
AW878659 AW878749 AW878681 AW883353 AW883277 AW883300 AWB83565 AW883298 AW883143 AW883045 
AW883482 AW883352 AW883417 AW883357 AW883231 AW883474 AW883355 AW882620 AW882533 AW883754 
AW883139 AW882827 AW883641 AW883567 AW883481 AW882983 AW882982 AW882465 AW883419 AW882466 
AW883639 AW883230 AW882981 AW882534 AW882874 AW882619 AW883480 AW882826 AW882831 AW882835 
AW882830 AW883563 AW882456 AW627642 

117156 145392 1 W73853 AA928112 W77887 AW889237 AA148524 AI749182 AI754442 AI338392 AI253102 AI079403 AI370541 AI697341 
H97538 AW188021 AI927669 W72716 AI051402 AI188071 AI335900 N21488 AW77047B W92522 AI691028 AI913512 
AI144448 W73819 AA604358 N28900 W95221 AI868132 H98465 AA148793 

131859 3672_1 AW960564 AA092457 T55890 D56120 T92525 AI815987 BE182608 BE182595 AW080238 M90657 AA347236 AW961686

AW176446 AA304671 AW583735 T61714 AA316968 AI446515 AA343532 AA083489 AA488005 W52095 W39480 N57402 
D82638 W25540 W52847 D82729 D58990 BE619182 AA315188 AA308636 AA112474 W76162 AA088544 H52265 
AA301631 H80982AA1 13786 BE620997 AW651691 M343799 BE613669 BE547180 BE546656 F11933 AA376800 
AW239185 AA376086 BE544387 BE619041 AA452515 AA001806 AA190873 M180483 AA159546 F00242 AI940609 
A1940602 AI189753 T97663 T661 10 AW062896 AW062910 AW052902 AI051622 AI828930 M102452 AI685095 AI819390 
AA557597 AA383220 AI804422 AI633575 AW338147 AW603423 AW606800 AW750567 AW510672 AI250777 AA083510 
AW629109 AW513200 AA921353 AI677934 AI148698 AI955858 AA173825 AA453027 AI027865 AW375542 AA454099 
AA733014 AI591384 R79300 R80023 AA843108 M626058 AA844898 AW375550 AA889018 AI474275 AW205937 
AI052270 AW388117 AW388111 AA699452 AI242230 N47476 H38178 AA356621 AA113196 M130023 H39740T61629 
AI885973 AW083671 AA179730 M305757 AI285455 N83956 AA216013 AA336155 AW999959 T97525 AA345349 
T91762 AA771981 AI285092 AI591386 BE392486 BE385852 M682601 AI682884 AA345840 T85477 AA292949 
AA932079 AA098791 D82607 T48574 AW752038 C06300 

125565 1704098_1 R20840 R20839

133607 1227_6 BE273749 BE397561 BE387189 AL037858 AL037878 AI953094 BE259216 AA011363 AL036189 BE562325 AA251169 

BE617431 N98537 M158093 AL0478CO M34539 NMJ00801 M312140 D16971 AA158904 AA307114 AA312803 
T09203 AWS29686 AL048504 BE388578 AA220957 AA1583B4 BE267385 AA294971 C18055 BE241757 AA115056 
AI936769 BE378435 BE206971 AW674924 BE622060 AA604674 AA115273 AW402159 AA338608 BE568819 M80199 
X55741 M3751 1 1 M376016 BE612671 AA805742 AW405588 N25850 N44580 H06031 AW403549 BE536552 AA05S726 
BE543239 AA082517 AI201645 AI201642 AI192622 N40104 AA370921 BE547569 AI969602 AA302038 AI197890 
AW268354 AI014938 W45448 AI541395 AA037272 BE538826 AL039613 BE536130 AA299355 AW805147 AW974624 
H53220 AI471471 AA399303 AA007386 W35106 BE613277 R12739 R12738 AA304342 AA687802 BE409581 A1498844 
AV662092 AW904105 AA011375 BE315214 H99302 BE537893 N32299 AW855829 AI291320 BE078322 AI301395 
AA303362 N32719 AA358328 M357877 AI952540 H56279 H02758 H02048 AW805233 R82224 AA410772 AA291352 
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13893J 



126872 142696J 

121335 279548J 

130018 18986J 

121822 244391J 



BE171109 N69935 BE169248 AA361173 H44978 BE617887 D52560 AA084043 W03595 R67219 N36477 N42924 R67104 
H44901 H79695 W21 105 AA393988 W30899 AA31 6096 BE622896 W46872 AA442678 BE544893 BE540112 BE621873 
AA338067 N55052 BE398154 BE621210 AA740760 C03739 C03206 BE396692 AA482370 AA031614 AA301575 
AA304710 M132153 AA029796 AA994960 H19567 AA442969 H49781 H46871 AA035395 AA056185 AA149378 
AA643080 AL135479 AA292329 AA654337 AA041228 AA454888 AA025039 W58331 AA625981 T94941 AA302448 
H19900 AA218956 AA513790 AA563962 AA398076 W44441 AA293276 W47373 AA625879 W30688 M043029 T64284 
R79151 AA304340 AA485186 AA604939 R82470 AA421425 AW771456 AI339329 AA304424 AA605236 AA936934 
AA587673 AI209162 A1697301 AI4799G5 AI679814 AI361950 AW189125 AI955888 AI986019 BE301019 AI084792 
AI310211 AW189307 AI022070 AW977204 AI146825 AW190163 AW303281 AI828345 BE046043 AW029257 AA482268 
AI246507 AI420729 AW084932 AW439514 AI890487 AW439692 AI523896 AI186612 AI659953 AI889773 AA687527 
AW072694 AW262153 AW467371 AI613259 AI679238 D54404 AA158103 AW105527 AW149739 AW150361 AW268387 
AW117708 AI951682 A1687440 AW674285 AA678365 AI5S7082 AA732095 AA019899 W45661 AA627300 BE613304 
AA765891 AA612935 AI814658 AW316915 R66594 AA514640 AA025040 AA031472 AW732076 AA029797 AI244560 
AI128734 AW381720 AI092360 AI263283 AW613175 AI890675 AI720156 AW631348 AI635106 AI278045 AA303979 
AA703505 W45449 AW078661 AI292052 AW381707 AI147854 AW381743 AA158905 AA303258 AA888144 AW195967 
AA428706 AA989559 AA617731 H19882 BE543418 AA830386 AA421302 W58652 T94995 AI869743 AI679145 
AW085971 N98425 AA765136 AI347027 AI356955 AA928038 AI679717 M458459 AA679281 AI367973 AI270041 
AA765135 AA732793 AI798447 M668646 AA251 008 AI984538 AI401737 AA056186 BE043308 AW662375 AI3021 10 
N50724 W96332 BE537047 N26983 AI567172 AA765296 AW673237 N29784 AA534275 AA084044 AW067973 
AW300766 T63398 W46823 R39790 AI3B41 85 AW298582 AA454814 AW069878 N67751 H05982 N23140 AI362647 
AI302086 AI767772 N25755 H53114 AA706133 T93511 AA429291 AA935294 AA987647 W02803 R66595 AI680795 
W23673 AW440794 AA722872 H49538 AW131042 AA531603 AA908665 AA040791 AA235312 W52205 N93444 R82180 
H02759 H79696 AW088894 H56079 AA961 143 AW067776 AW973745 AA01631 1 AW071227 AA017511 AI753994 
W47374 T64155 AA296092 AI698625 AA558158 AA296088 AW794259 H01963 AA149267 AA485076 AA975856 H44938 
AA035396 AI955555 H46289 AA486161 AI631222 AA359047 AW794253 AI806962 AW243930 AA526145 AW878734 
AA018464 AA132031 R67220 R79152 AA296093 H54300 A1005160 BE242548 AW992803 AW878644 AW878666 T27742 
R82471 AW517604 AW472738 AI282904 R39791 AA486098 AW467891 AW960520 AA551736 AA056621 AW945197 
R66373 AA554236 BE242202 AI904376 AI832590 H19484 R0D890 AI627677 AA302287 AI869451 AI734855 AI708073 
AI832902 AA585184 AW204299 AA055565 D12417 D11975 T63543 AW664099 R54423 BE612712 T96340 T63985 
AA598917 T40735 T64053 AA149284 AW272548 AA363445 AA042893 AW300697 BE261973 T53501 T53500 AW878729 
AW878657 AW794391 AA069193 R01 553 H44875 AA385406 AA533968 M93060 AL135600 W96331 AA017651 
AA018849 M017692 H85337 BE278690 M731598 AA018512 AI076813 AI022644 R02585 X5222D AW296894 AA825671 
AIS99321 AI3S3601 AW592611 AI146747 AA608921 M158365 AW590007 AA354519 D20081 R02704 AW798339 
M92422 AA094903 AA007676 

AI352558 Z82248 X78138 NMJ03405 AU077248 M223125 S80794 D78577 AI124697 AW403970 BE614089 BE296713 
BE621334 L20422 X80536 D54224 D54950 X57345 N29225 AA127798 AA340253 F08031 AA192540 H67636 AA321827 
AW950283 AA084159 BE538808 AW401377 AA256774 C03366 W46595 W47608 AA305009 H69431 H69456 AL120082 
H11706 AA303717 AA361357 H22042 H78020 AW999584 AA134368 M32291 1 AA322961 H60980 N85248 N31547 
H79624 T1 1718 W85826 AW894663 AW894524 BE167441 BE170015 AA304626 AW602163 AW998929 AA156681 
AA151067 BE002724 AA608688 H82692 BE155392 AW383636 BE155394 AA487004 AW383504 AI342365 R82553 
W16498 BE155344 AI143938 R69S01 AA322873 AW340648 R25364 AA367935 AI559406 AA033522 AA374252 
AW835019 A1922133 AI697089 N99662 AW1B9078 A1199376 AW151598 W59944 M662875 W94022 AA299055 
AI039008 AI829449 AA583503 AI635674 AW131665 AW73820 AW273118 AW900930 AA908944 AI688035 AW17Q272 
AI082545 AW468176 AI608761 A1082748 AI91 1682 AI248943 AI831016 AA192465 AI218477 AA938406 AA385288 
AI809817 AA905196 AI191245 A1470204 AI188295 AI421367 AI125315 AI087141 AA629032 AA74D589 AI554181 
AA1 50830 AI248541 AI077943 AA775958 AA864930 AI261476 AI123121 AI310394 AA862331 AA872478 BE537084 
AI205606 AA720684 AI872093 AW150042 AL120538 AA219627 AA988608 C21397 AI359337 H25337 A1089749 
AA605146 AI359620 AA150478 AI359738 AW383642 AW995424 AI766457 R56892 AI089839 W61343 N69107 W46459 
AA565955 N20527 AI279782 W46596 AA776573 H23204 AI866231 AI083995 N21530 M126874 D82630 W65437 
AI086917 AW382095 AI086877 H59844 AW340217 W85827 L08439 AA262704 AA505380 W47413 W94135 AA223241 
AW089153 AA084101 BE538000 AA096125 T28031 AA491574 R84813 AA774536 AW383522 AA155615 AW383529 
AA491520 AW028427 AA171496 AI469689 AW664539 A1811102 AI811116 BE464590 BE350791 H78021 T15405 H21979 
AA219489 H13301 AA505883 AI864305 AW23963 AW0844O1 F04963 R69858 H67097 AI917740 AI655561 H69864 
AA033631 AW383484 AI886261 H25293 AA51 3281 AW271 187 H1 1617 N79982 AI174338 A1904207 AI9042D8 BE614558 
W94127 W65436 AI272249 M70001 8 AI579932 AI085941 AW152629 

AA334551 BE008229 AA307537 AW961156 AW995894 AW995826 NM_006751 M61199 AA045603 AL036372 AV645606 
AI688095 AW351901 AA101337 AA101345 N73342 BE018030 BE569044 AW841975 AA373388 BE090412 H95440 
N53845 R67867 AA093441 AA363427 H93708 AW023134 AW994986 AW994989 BE090429 R23614 A1567932 H03726 
H01 101 H01867 AA548743 AI671806 AW872949 AW672941 AA742447 AI1997B8 AA045604 AI537465 AI741796 
AW242217AW131463AI765302AI683923AA889762AI804889AI986437 C06049 BE502340 AI695651 AI491970 
M496804 AA281008 AA665699 AI473814 BE301445 AA707837 AA551925 AI017348 AI208185 AA775203 M156296 
AA557463 H95441 AA768547 AW769358 AA991 197 AA181 954 AI091 389 AI147289 AW771837 AI638582 AA84441 1 
AI374750 T29320 AW951272 AW085923 H02834 AA843259 AA814696 AW183290 AA158453 N68125 N69039 AA100423 
AA101346 AI918720 H01102 R67868 H01868 N66438 R46580 AI858433 AA599560 AA187577 AA157481 AA361520 
AL047827 AA158452 R21688 AW964874 AA3251 61 R40871 AW752395 AW375924 R1 3355 AA281 174 AA428908 
AW450979 AA136653 AA136556 AW419381 AA984358 AA492D73 BE168945 AA809054AW238038 BE011212 BE011359 
BE011367 BE011368 BE011362 BE011215 BE011365 BE011353 
AA404418 AI217248 

AA353093 AW957317 AW872498 AI560785 AI2891 10 AW135512 X97261 T68873 

AI743860 N49543AW027759 BE349467 AI656284 BE463975 R35022 AA370031 AW955302 AL042109 N53092 AI611424 
AL079362 AI969290 AI928015 BE394912 BE504220 BE467505 AI611611 A1611407 A1611452 W56437 A1284566 
AI583349 AW183058 AI308085 AI074952 M437315 AA628161 AW301728 AI150224 AA400137 AA437279 AI223355 
AA639462 A1261373 AI432414 AI984994 AI539335 AA401550 AA358757 AI609976 AA442357 AA359393 M437046 
AA370301 AA429328 AW272055 AI580502 A1832944 AI038530 AA425107 AI014986 AI148349 AW237721 AW779756 . 
AW137877 AI125293 M400404 R28554 
genbank_AA608588 
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123533 genbank AA608751 AA608751 
125091 genbank_T91518 T91518
123964 genbank_C13961 C13961
102491 entrez_U51010 U51010
118475 genbank_N66845 N66845
118581 genbank_N68905 N68905
113947 genbank_W84768 W84768
101447 entrez_M21305 M21305

101657 13349 1 NM_005381 M60858 AW373732 AW373724 AW373689 AW373629 AW373609 AW373776 AA187806 AW386946 

AW374207 T05235 AA216203 AW385556 AA306940 AA305526 AA315451 AL036757 AW373711 AW403124 AW403640

AW377084 T27360 H62638 F06957 AW377051 AA554779 AA378558 AA096007 AW352407 AW302637 F07929 H17433 
AW382712 H05665 F07292 N39875 AA089729 H62556 N42842 R12952 AW373735 AW364155 AA056183 W39185 
AW382708 N32488 AF114096 AW375993 AI133559 W52561 AA603040 AA133710 AI928796 AW176370 AA827519 
AW338437 AA521 142 T29341 AI800461 AW317032 AA703914 AA860830 AI859203 AI445772 AA714334 AI817066 
AI832027 AW510442 AI635802 AW088306 AW068872 AW408555 AW467542 AA552657 AA152367 W32081 M582124

AA074040 AA931657 AI051154 AW410203 AI921644 H17434 AI832330 AW404836 AI925038 AA088423 M954166 
AA580453 AW021292 AI267215 AW080082 AW383778 AI933053 AI919097 W31557 N90245 AA931591 AA563995 
F36352 AA056184 AA476294 AA641327 AA533550 AI749630 W58323 AA569119 AA508573 AI809050 AI378996 
AA411362 AW407505 AA938104 AA074041 AA632876 AW193748 AA507873 AI270128 AI472365 AA411363 AI523216 
AI719965 AI816302 AA182681 AI707990 AA133588 AI758537 W60253 AI460308 AA135423 AI083904 F04188 N89693

AW408776 AI678595 AI270568 AA722059 W58234 F33650 AA090547 AA285108 M425981 N85079 D20218 AI273980 
AA159028 F03226 AW247914 N26918 AW272741 N90109 H05666 N23327 AW247953 R44748 AA962015 F03558 
AI752394 AW409913 AW248398 AI816463 AI752393 AA325370 AA263089 AI570130 AI971951 AI160658 AI357350 
AW168686 AL121075 AW050536 N21672 W67748 AA514242 AI127386 H14607 AI185752 W79364 M088520 AA152476 
AW351940 AW373683 AI940524 AW374953 T565C0 N24329 AI940720 AW374933 AW374947 AW391913 AL138337

AW376241 AW062943 F26666 AW410202 AW062958 F34529 AW381807 AW393315 W17147 AW176359 AA664576 
AW380424 M306040 AI745674 AW300951 AIJ88579 AI438973 AI305271 AA433818 AA612807 AI831809 AI940409 
AA1 58663 AI572988 
108931 genbank_AA147186 AA147186
103138 entrez_X65965 X65965
103432 entrez_X97748 X97748 
119174 genbank_R71234 R71234 

133678 11235 1 AW247252 M346143 NM 000270 AA381085N91995X00737 AA381079AA296473AA2961 10 AA315735AA31 1617 

AA326750 AA376804 AW403290 T95231 M13953 T47963 H82039 AA279899 AA627997 N76320 N99527 H37842 

W20095 AA457308 AW469547 AA724143 H83220 AA319496 W86334 W30892 R89169 R99427 N41854 H47286

AA348094 M045089 R6301 6 AI922219 AI02490B AI096488 AI885005 AA194872 N90489 AI452544 H7241 1 AA2B2427 
AA430735 R68963 R22453 H70385 AW129369 AW467320 AW519082 AA345018 AA582183 AI951789 R65918 N3051 1 
AI979189 AI280889 AW273191 R66531 AI285845 AI675927 AI421990 AW190879 H37794 M699667 H68427 AA954388 
AI188757 AI140048 AA430382 AI204151 AW247864 AA559099 AI431420 AA548276 AI149466 AA772669 AA694388 

AA724168 M301651 AA281952 AA779925 AA234750 W86290 AA913603 AW511745 AI50D597 AA814922 AA835040

T47964 H53998 AA975804 R98710 AI077604 N70252 R98084 AW250171 H69268 AI597614 AA970746 AA972548 
AI3771 16 R62962 H16737 R89070 AA731329 R66532 N54354 AI818832 H81944 N71557 T95122 W86463 AA437095 
AI431999 AI915724 N53851 AI674743 AA457307 AA21 1475 N64444 AI799145 H72853 R99335 H60413 AA770367 
AA156105 AI269937 H64029 H89728 R65819 AW470496 AI873318 AI735713 H82987 C02447 AI478666 T27651 

AI699770 AW025156 H69719 AI984717 N69225 AI459856 AA953577 AI424691 H13843 R22404 AI873796 A1336002

N70898 AI420854 AA541792 AA346142 AI000814 AI828348 AA045090 T51257 N90434 H13890 N73184 AI708083 
AA781606 AA329050 AA339985 R68954 H64795 W04186 H16845 
119416 genbank_T97186 T97186 
119559 NOT_FOUND_entrez_W38197 W38197 

123473 genbank_AA599143 AA599143
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Pkey: Unique Eos probeset identifier number 

Accession: Accession number used for previous patent filings 

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title 





Pkey Accession ExAccn UnigeneID UnigeneTitle

115819 AA426573 AA486620 Hs.41135 AA486620
132837 D58024 AA370362 Hs.57958 AA370362
101545 M31210 BE246154 Hs.154210 BE246154
102898 X06256 NM_002205 Hs.149609 NM_002205
101192 L20859 BE247295 Hs.78452 BE247295
102915 X07820 X07820 Hs.2258 X07820
105330 AA234743 AW338625 Hs.22120 AW338625
107385 U97519 NM_005397 Hs.16426 NM_005397
102024 U03877 AA301867 Hs.76224 AA301867
134416 M28882 X68264 Hs.211579 X68264
103036 X54925 M13509 Hs.83169 M13509
104865 AA045136 T79340 Hs.22575 T79340
106124 AA423987 H93366 Hs.7567 H93366
105330 AA234743 AW338625 Hs.22120 AW338625
109001 AA156125 AI056548 Hs.72116 AI056548
104764 AA025351 AI039243 Hs.278585 AI039243
133200 AA432248 AB037715 Hs.183639 AB037715
105263 AA227926 AW388633 Hs.6682 AW388633
105178 AA187490 AA313825 Hs.21941 AA313825
109456 AA232645 AW956580 Hs.42699 AW956580
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TABLE 5A 

Table 5A shows the accession numbers for those Pkeys lacking UnigeneIDs in Table 5. The Pkeys in Table 7 lacking UnigeneIDs are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number 
Accession: Genbank accession numbers 
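The Pkey / CAT number / Accession layout defined above can be read into a simple lookup structure. The following is a minimal sketch only, not part of the original disclosure. It assumes that a record begins with a six-digit Pkey followed by a CAT number of the form digits, underscore, digits (for example "10241_1"), and that any following lines holding only accessions continue the same cluster. The function name `parse_clusters` and the regular expression are hypothetical, and the sample lines are abbreviated from the table entries for Pkeys 115819 and 109456 with the CAT numbers written in the assumed underscore form.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed record start: "<6-digit Pkey> <CAT number like 10241_1> <accessions...>".
RECORD_START = re.compile(r"^(\d{6})\s+(\d+_\d+)\s+(.*)$")


def parse_clusters(lines: Iterable[str]) -> Dict[str, Tuple[str, List[str]]]:
    """Map each Pkey to (CAT number, list of Genbank accessions in the cluster)."""
    clusters: Dict[str, Tuple[str, List[str]]] = {}
    current = None  # Pkey of the record currently being filled
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines separate nothing; simply skip them
        match = RECORD_START.match(line)
        if match:
            pkey, cat_number, rest = match.groups()
            clusters[pkey] = (cat_number, rest.split())
            current = pkey
        elif current is not None:
            # Continuation line: accessions only, appended to the open record.
            clusters[current][1].extend(line.split())
    return clusters


sample = [
    "115819 10241_1 AA486620 AF205940 AA297524",
    "",
    "AW582203 AA248250",
    "109456 180633_1 AW956580 AA886361",
]
clusters = parse_clusters(sample)
```

With such a mapping, the cluster for a Pkey that lacks a UnigeneID can be looked up directly, e.g. `clusters["115819"]`.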



Pkey CAT Number Accession 

115819 10241 1 AA486620 AF205940 AA297524 AB034595 AAQ81335 NM_016242 AA188323 AA297537 H88204 AW953081 W31695 

AW582203 AA248250 AW68121 1 AA426230 AA464B07 AA426155 N44141 AA347390 AA770661 AI333225 N36136 
AW665724 AA431894 AI374976 A1400254 AI338446 AA186695 H83205 W04527 AA487066 AI051414 AA918383 
AA426573 AA425620 AW438654 AA090513 BE167284 BE167291 AI301726 

102024 14505J AA301867 AW957981 R27614 AA155808 AI920990 AI740711 M301026 M301015 AI220981 AI857670 AI537140 

AW015210 AA030000 W46890 H44021 AI355967 AI651735 AA058479 AA146932 T58265 R85890 AA047810 AA017387 
AW026093 AA971133 AI827263 AIQ56415 AI355994 AI127691 H46603 U03877 NM_004105 AA157357 H42844 
AA146824 M187709 AA187269 AA304348 AA147292 AA361687 AA156041 AA330636 R32929 AA321130 AW950260 
AA082157 AA029129 AA303708 AA028155 D31561 T84689 AA302493 BE153057 BE153181 W39408 AA187200 
BE153250 AW383337 AW382622 AW382647 AW750072 BE153060 AW382630 AW371865 AW392464 AW382664 
AW382658 AW382650 H61647 AW365075 AW365049 AA373397 BE072779 BE072781 Z30254 W24381 BE153254 
AA040442 BE072729 BE072731 N94740 M146945 AW802737 AI826799 AI085395 R34034 H65140 AA082800 H88275 
AA1 47824 R63882 W80899 AA296413 AI765300 AI862426 AW022055 AW300003 AI743784 AI862635 AI985428 
AA147764 AW573245 AW190290 AI040898 D57613 IM63457 AA148082 AI028458 AA148110 AW814489 N75105 
AW629443 AA704122 AW582220 AA181240 AA057495 AI418224 AI251751 AW388595 AI472205 AW470672 AA102546 
AA789046 AA182416 AA062668 AW300732 AI288220 AA181982 AA146825 AA028130 AI985522 AA303344 AA081313 
N69082 AA182035 AI867128 AA100902 AA605087 N67178 AW020324 AW890446 AI472191 AI335591 AI597837 
AI081 143 AI335681 AA040443 AI128067 AI678244 AA018303 M157260 W80792 AI934590 AI096430 T54343 AI445350 
AA165196 M780683 AA603631 AA047787 AA968580 AA912645 AW890504 AW026913 D56983 H52088 AA156121 
R30848 AW023036 AI590960 N67345 AI753225 AI753283 Al 1 83768 M147818 H89101 AI362141 H89205 AI14771 1 
AA321129AA668622AA343479 AW069438 AI422376 AW629270M013413AI221948AA970605N52335 H38366 
T91 180 AA657841 AA0173B6 AA152227 AA187593 AI91 3340 AI719313 AI969943 AI701271 AI004328 AI863348 N93659 
H65093 H25736 D57007 D56957 C00987 D51839 D56661 AI472137 AI971002 D56971 BE048830 D57972 AI589286 
AI361055 AI361071 AI292223 AA155898 D57139 D57981 D57345 AI420034 D57332 D57959 AA875933 R33493 N67558 
D58353 AA188394 M147986 AI16G640 AI363165 H40538 AA578137 AW950265 AA300943 AI128999 H4G584 AA917355 
N57820 AA320504 H51959 H25737 

101545 24607 1 BE246154 M31210 NMJJ01400 AA193392 NH.016537 AF233365 AF022137 H27787 AA370448 F05373 T27666 W21494 

AA036907 AI249966 N93476 F01623 AA304390 AA30880B 

109456 180633J AW956580 AA886361 AI147670 AI090115 AI168683 AA232645 H99504 AA374707 AA380875 AW139567 AI735132 
BE439385 AW629780 N28322 AA232789 AA2327S0 N73285 

103036 17145J M13509 X54925 NM_002421 M16567 X05231 M15995 W39354 AA186634 AA852324 AA187507 AA081 149 AA1 86524 

AA187264 AA187361 AA386155 AA185973 AA374217 U78045 AA081230 AA188049 AA186393 W56827 AA852602 
AA157468 AA308204 AA186754 AA186808 AA082516 AA304334 AW376428 BE439384 AW376420 AA156273 T18504 
AA1 86521 W49496 AW084608 AA083575 AA372360 AW963590 AA1 32297 W47445 M186376 AA1 57628 AW003999 
AI037890 AI858060 AI589010 AI743739 AI452673 AW304188 AW117854 E3E439933 AA157416 AW778966 AI038497 
AA081006 AA100829 AA181048 C02231 T27821 W23960 AW954802 A1471432 AW801295 AW801289 AW801603 
AW801523 AW801292 AW801542 AW801601 AA181134 AI445147 AA191501 AA582862 N94407 AI147810 AA181880 
W49497 W52714 AA188249 AI932881 AI082493 AA503656 AA182682 AW801393 M182830 AA181882 AA182826 
AI613182 N94510 W47343 AIC85755 AI076956 A1918426 AA081208 AI282835 AA147528 AI081490 AI654536 AA181875 
AA081282AA186389 C06085 AA083542 AI800644 AA157642 AA101069 AA157752 AA158121 AA143331 AA081283 
AA852603 AA188296 AI932880 AW449628 AA187348 C02091 AA514656 AA082736 M308786 M143201 M16567 

133200 28960J AB037715 AI351347 AI375796 AI884765 AL121 124 W01068 AI807275 T95240 R42807 AW515645 AI057314 AI033520 

AA057671 N70215 AA054215 AW204183 AA552149 T95130 AW796310 AI866520 AW275564 AW796308 AI637901 
AW197404 T78406 AA456232 AW206463 AA779800 AI052693 M026744 AA454623 AW470729 R45490 AW770258 
AI03B393 AI290170 AA722734 AL1 21 125 R41608 AI862414 AA83861 1 R45582 AI278083 BE466849 BE219944 
AA418030 BE041555 AA578572 T1 6528 AW006344 Z39782 AI244848 AW137344 AA707400 AI032028 BE540464 
A1094265 AI184281 AA931890 AW382744 AW382729 AW020448 AW827237 AA431 226 A1672059 AW772345 N70172 
AW022003 AI862704 H19344 R6151 1 AI080204 H16566 AA432248 AI767980 T16688 AI984342 AI217478 AI767095 
Z38551 AI359566 AI361437 AI041000 R07033 H16608 H19054 R12874 R61567 N98368 BE221199 Z42320 AA094554 
R07078 AW860886 AA418090 R41262 

132837 256666J AA370362 AA364110 AW959554 AW371737 AW382068 AW604716 AW604713 M487827 AW371674 AA429137

BE503321 T93570 W72803 AI093076 AA487977 AI2415B2 BE439445 AW204065 R51635 AI802994 T10362 W68553 
AI866215 AW152154 M700716 AI127443 R15824 AI537587 AA9531 10 D58024 AI52081 1 AA693670 AI453280 W76329 



NMJ02205 X06256 M13918 BE070866 AW239485 AW996127 BE273894 BE272590 BE410252 R25975T1 1786 T1 1787 
AA301 142 AA301 165 AW960506 BE272819 AA386086 T39391 AA285303 M370580 D58585 T58668 AA156213 W24142 
AA343323 AW796067 AA151 1 97 AA376121 R94782 AA302363 H90357 R82621 AA301677 H55997 AW796059 W92358 
AL046458 M471 198 M301 952 R46287 R82594 H031 86 AA187706 R32562 R27094 R25947 R25320 AW949809 H1 3505 
H79049 R32403 H11213 R39710 H49765 H21142 H21006 AA417664 W52075 N56771 AA284240 N98556 N30907 
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105330 
104764 
104865 



182497J 
90967J 
102037 J 



106124 54542J 



101192 15367J 



AA707335 AW603781 AI340367 AI814584 AA5241 82 AA370076 AA418785 AA704082 AI806851 H25513 T56388 
AA419627 H03986 H20963 T56245 A145971 5 AW973768 AI334096 AI693020 T63414 R82646 AW167251 H55998 
AI274916 AA778367 AI755253 AI033667 AW083222 AA181979 R26865 AA6S1627 AA70S329 AI798648 AA6127G9 
AI160180 AI274973 AI039264 AA301880 AI042429 AA307632 AI085688 AI278366 AI498890 AA303B65 AI954844 
AA502380 AA156334 AA723480 AI803584 AI581026 AA304584 N51038 R94702 R69814AW1 50962 AI570049 AA588807 
AA151198 T53400 AI567709 AI185326 AA309205 AW338969 R53903 AA991891 AA301643 AI493337 AI026049 H25514 
AI741075 R28S32 AW166445 AI333Q68 H49978 H91267 AA558193 AW079663 AA627380 AA807401 AI199956 AA66S118 
AI718216 AW193228 A1077745 AI500495 AI266059 AW080383 R06468 R26757 R32404 AA716599 W92322 AI077734 
AI270181 R46198 AI217540 AA304045 AA305421 AW074445 AI458256 AW089568 AW571605 BE162930 H41009 
AA578313 AW874497 AA181284 AA861947 T29451 D20841 T58618 AA418731 AI282500 AW081407 AA604560 
AA729855 AI262538 AI580225 

X07820 NM 002425 BE271570 AI263526 AW295143 AI829878 AI973162 AI085155 AA857496 AA709305 C02220 
X68264 NM.006500 AF089868 BE25746 1 BE275425 AW9971 54 AI902799 AI902803 M78206 AA085591 AW392972 
AA325490 BE006161 AA349269 AA323568 AL042548 AA191 148 AA187703 AA322791 AJ297452 T11625 AW366487 
AA303513 AA186961 AA1 73480 N28330 N28379 W40320 AA187118 H03595 M402709 BE407476 H06354 BE276589 
AA351284AA379921 AL138060 BE410587 AA113094 AA340481 BE277483R21191 R79518 N86170 AA320505 
AA296065 AW951900 AA658897 AA650052 AA654304 AA191691 N26649 AW080963 AI265800 N72019 AI453458 
AA092563 AA402310 AI439450 AI061054 AA302358 T71566 AA302047 AA303432 N21289 H27357 AA303504 AI174583 
AW151762 AA181958 AW880618 M630773 A1889539 AW901058 AI373405 AA341941 AA086217 AIB75590 AI653936 
AA633570 AA987619 AI270656 N93847 N40689 AW517517 N20030 W95985 AA303955 H89170 AA309917 N21642 
AA373132 W38517 AI687806 W76182 AA101055 AA036915 N45635 AI744510 AI669803 AI039157 AI126355 AA634607 
AW131 120 AW196838 AA190601 AA91 1 130 BE221320 N92355 AA036752 H03696 AA588873 AI458868 AI041818 
AA090477 AI093248 AA304755 AL137942 AL044688 AI083709 AI150965 N88891 M635675 AA59489B W94657 
AA1 82823 AW166205 F27886 R79246 F37329 AA565697 AI075739 AI088654 AI094287 AI204256 AA0952O3 T93020 
AA688298 M057324 N23442 AA07541 1 AA305046 AI031688 AI1 91 503 AA1 11887 AA1 12264 N27929 M1 87509 
AI375522 AI474006 H06297 AI826177 N48880 H28333 AA075490 R22809 W79542 AI055934 AA042901 AA173481 
AA301986 W74531 AI051747 AA187715 AI888888 AA993017 AI057530 T92954 N80227 AW273595 AI351260 AW170643 
AW292979 AA302605 AA302330 BE349495 AA328602 AA302361 AI470984 AA155943 AA155914 
AA313825 AW960347 AF223468 NMJJ16613 AA186345 AA186508 AA081195 AA147972 AA346943 AW961667 
AA187222 AA187207 AW371052 AW449751 AW748803 AW391606 AW371047 AW371057 AW371085 AW362895 
AW371092 AW377556 BE010930 AI016882 AA247878 C04398 C05158 F1 1398 AA188315 H23385 R55086 H15346 
AA029106 AA228114 H17005 F08498 Z43376 AA095582 AA055186 AA463361 R15218 AA2991 32 AW1 03578 W21538 
M428131 AA187115 AA157197 AA157167 AW371371 AA363562 AW965995 N55663 Z17878 AA228023 AI140342 
AA100927 AA496988 AA055917 A1C89303 AW014967 AW090248 AW338371 AW131066 D62963 D79713 AI583950 
AI336781 AI5007O5 AI471485 AW090239 D79784 D61847 DS2789 D61842 AI086327 AI273381 D61815 D63043 AI913548 
AI280560 AI510828 AA029996 C16343 C16513 AI075741 AW516308 AI804764 M948068 AI35B588 AW103452 
AW573063 Z39445 C164S9 AI949870 F04712 AA147823 AW026284 AI151538 AA081303 AA613B90 AI251865 AW086499 
AA992111 AI862091 AI373465 BE502094 AI922270 AA884288 AA157079 N56963 AW189145 M428080 R55056 
M884068 AW771716 AA1866B2 C16364 H15723 AI921181 AA156888 H17006 AA187490 AI4D0994 AA345942 H28533 
AW129047 R41656 H14636 AA995041 D58370 Z21131 D58186 AI383271 AA643977 D5B044 AI934302 AW779425 
F09065 H14930 AA890693 H23274 

AW388633 AW378440 AW388283 AW388339 AW388333 AW388414 AW388413 AW388B07 AW388453 AW388687 
AW388480 AW388591 AW38871 1 AW388511 AW388438 AW388570 AW388449 AI694383 AW237145 AIB52991 
AI964041 AW366319 AW366321 AW961938 AW46921 1 AI634155 AI492186 AI624430 AI677955 N26502 AI963871 
AW378431 AW378421 AI015391 AW352126 N59336 AI352317 AW197113 N67998 AW778935 AI476054 AI206626 
R371 16 R4021 1 AA227926 AA639698 R38073 A1001745 T32854 AI619649 AI423703 F10774 AW388B15 T16595 H05894 
AW338625 R43226 R51640 AI307645 AI308100 AI085787 AI420357 AIB92610 AA877160 AI95336B AA234743 
AI039243 R68234 AA025351 AA971063 AI537757 AA025362 R81636 T86650 

T79340 AI742317 AW182676 AW451460 AI420964 R43284 AA088179 AW590886 AW269529 AA045187 AI521736 
AI827455 AA045136 AW271709 AI004344 AA639631 M744417 AA744218 M045357 AA045351 
H93366 AI653547AA336265AW966175 BE566451 R71178 AI630656 AA234331 N55039 AA305S32 AW960431 R34044 
R32254 AW020970 AW451281 AW275041 AI636933 AI655640 AA423986 AA642466 AI684063 AI633876 AI624897 
AA814795 AW590328 AI889166 AW243541 AI439691 AW473445 AI475516 AA741228 AI127534 AA165143 AI074714 
AI654076 AA400674 AI560249 N507C9 AW438621 AI806810 AI434579 AI308184 AA423987 AI141272 AI565586 
AI338440 AA219628 AI246643 AI985809 AA724260 AA633988 AI364172 AI798439 AI650801 R33503 AI435891 
M903649 T96161 AA665538 AA219620 AI309952 AA400707 BE247066 R32178 AI275962 M661602 AW003197 
BE466649 AA831 198 AI620052 AI825387 AI634037 AI670978 AI670979 AI655092 R32304 AA828858 AI382428 
AW023660 AA262892 T26891 AW089917 T26926 R32227 

NM_005397 U97519 AW899329 AI902337 AA077792 AA078525 AW376607 AA077946 AA070415 BE208721 AW167958 
BE293050 BE208240 AI648698 AA101314 BE393348 BE305122 AA077591 BE274036 AA313687 BE392220 BE378954 
AA171461 AA464821 AW938242 AW938224 AW938243 AW938232 AA147953 N64294 AA205218AW305065 AW517478 
AA307983 AA377023 BE563629 R99976 N80294T87719T87928 M496849AA486344 AA204938 AW370448AA318242 
AW964384 H92423 W95317 BE378774 BE391 1 56 AA349138 AA173095 AW513198 AA037672 AA148029 AA169726 
W04791 AA075508 BE382937 BE395034 AF139793 AA961734 N48612 H64714 AW151251 AI565113 AI566881 
AW087370 AA631168 AA622014 AW51 3098 AI85781 0 AW1 52287 A1052596 AI983246 AA024856 AI912456 AI677938 
AW026403 AA972537 AI088497 AW9998B9 W94582 AI140166 AI160659 AI566868 AA1 01263 AW190390 AW166466 
AI401207 AI418156 AI625265 AI146298 AW008592 BE223020 N58926 AI308797 AA037673 AI935992 AI304706 
AA024939 AI216589 AI610423 AI354621 AI500677 AI679389 AI799310 N64508 AI128756 AI679897 AW589535 
M989333 AI500527 AA565479 AA913529 AI923295 F21691 AA989376 AI699064 AA902447 AI690910 AA772659 
AA204983 AI337895 R99975 H65205 AA340766 AI339441 AI913855 AA450293 AW192010 AA07041S N72401 AI371481 
A1247108 AI371261 AI364987 AI280171 AI259104 AI868756 AA909836 AA983640 AI973271 AA913092 AI868205 
AI1441 12 A1190975 N58085 A1566638 N93405 AW150504 AW296846 AI687036 AA902984 AI824460 AI625047 AA653148 
AI61 1228 AW131922 AA862687 M902519 C01732 AW796045 AL044660 

BE247295 AW068092 AL041313 AA159244 NM_005415 L20859 AL135570 W47073 AW516906 BE388271 BE408629 
W4S972 BE293646 BE256647 AI075010 AL041095 AA285300 AL039560 AA368740 W26602 M399344 M039235 
W27S31 AW834898 AW834914 R93390 AA378039 AV649660 T53674 N98824 AA399974 AW843378 AA368267 R08256 
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AV653575 R27900 N4821 5 AW366371 N455Q0 AV652967 AI889251 AI080457 N39021 AI738542 AW242849 AI857471 
AI859775 A1582830 R75850 N66564 AW341636 AI499006 AI887217 AW026694 AW182840 AA039313 AA831346 
AI393465 AW069210 AI743830 AA744243 AA401310 AW439758 AW088152 R933S1 AA291379 AA225220 AW009358 
AI192879 AA291202 AI565089 AA225089 AA807688 AI052058 AI341641 AI066625 AA333864 AA159147 AI923912 
R75851 AI761143 AW768588 AA394195 AI288450 AW512564 AI452775 AI056520 AA468602 M872566 AI434739

AA291838 AI948623 AW768614 AI374753 AW068174 AA88490B AI199346 AI199347 W94946 AI159995 AA877642 
AI280646 A1307610 AA403310 R08205 AW182123 AI000999 R27808 AW026571 D20816 AI560350 T27667 AW950271 
AI174628 AI432042 AI424528 AA909562 T17342 AI783866 
109001 146370J AI056548 AW409843 AW263540 AA723669 AA909334 AA156120 AA157141 AA156125 AW409866 W19499 AA157229 
AW887435
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TABLE 6: 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

Unigene Title: Unigene gene title

AUC1: 70th percentile of average intensity (AI) for probeset at each of 2, 6, 15, 24, 48, and 96 hour timepoints minus 70th percentile AI at 0 hrs,
summed over 5 experiments.

AUC2: AUC1/90th percentile of AI for aorta, aortic valve, vein, and artery.
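The AUC1 and AUC2 definitions can be illustrated with a short calculation. The sketch below is one possible reading of the definitions, not the method actually used to generate the table. It assumes that the 70th-percentile AI value for each timepoint of each experiment is already available as a single number, that the baseline-subtracted values are summed over the six timepoints as well as over the experiments, and that the 90th percentile is taken across the four vessel-tissue AI values using linear interpolation. The function names and the input layout are hypothetical.

```python
from typing import Dict, Sequence

# Post-baseline timepoints (hours) named in the AUC1 definition.
TIMEPOINTS_H = (2, 6, 15, 24, 48, 96)


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def auc1(experiments: Sequence[Dict[int, float]]) -> float:
    """Sum of (AI at timepoint - AI at 0 h) over all timepoints and experiments.

    Each experiment maps hour -> 70th-percentile AI (assumed precomputed).
    """
    total = 0.0
    for ai_by_hour in experiments:
        baseline = ai_by_hour[0]
        total += sum(ai_by_hour[t] - baseline for t in TIMEPOINTS_H)
    return total


def auc2(auc1_value: float, vessel_ai: Sequence[float]) -> float:
    """AUC1 divided by the 90th percentile of AI across the vessel tissues."""
    return auc1_value / percentile(vessel_ai, 90)


# Illustrative numbers only; they are not taken from the table.
experiment = {0: 100.0, 2: 110.0, 6: 120.0, 15: 130.0, 24: 140.0, 48: 150.0, 96: 160.0}
example_auc1 = auc1([experiment] * 5)
example_auc2 = auc2(example_auc1, [10.0, 20.0, 30.0, 40.0])  # aorta, aortic valve, vein, artery
```

Under this reading, a probeset with a large AUC1 and low expression in resting vessel tissue receives a high AUC2.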



Pkey ExAccn UnigeneID UnigeneTitle AUC1 AUC2

314941 AA515902 Hs.130650 ESTs
327414 predicted exon
321911 AF026944 Hs.293797 ESTs
331578 AI246482 Hs.249989 ESTs K77L in?
332466 AB018259 Hs.118140 KIAA0716 gene product
313513 AW298600 Hs.141840 ESTs, Weakly similar to S59501 interfero 324 v>i
320535 N50617 Hs.80506 small nuclear ribonucleoprotein polypept 394.8
326230 predicted exon 357.2
313556 AA628517 Hs.118502 433.6 19
313665 AW751201 Hs.120932 ESTs -83
324852 AI380792 Hs.135104 ESTs 348.2
314372 AL040178 Hs.142003 ESTs, Weakly similar to The KIAA0149 gen -49.2
311877 AA084248 Hs.85339 G protein-coupled receptor 39 -1309
322252 AA632012 Hs.188746 ESTs -247.8
312173 AI821409 Hs.304471 ESTs, Highly similar to AF116865_1 hedge -1025.8
319795 AB037821 Hs.146858 protocadherin 10 203.6
313350 AW591949 Hs.57958 ETL protein 183.8
326759 predicted exon 1654.4
300318 AW444502 Hs.256982 ESTs, Highly similar to AF116865_1 hedge -346 1
313978 AI870175 Hs.13957 ESTs 576.6
306840 AI077477 Hs.307912 EST 55.4
310272 AF216389 Hs.148932 semaphorin Rs, short form -127.6 n
315044 BE547674 Hs.204169 ESTs -102.6
321325 AB033100 Hs.300646 KIAA protein (similar to mouse paladin) 1080.6
303251 AF240635 Hs.115897 protocadherin 12 1270.8
302378 AL109712 Hs.296506 Homo sapiens mRNA full length insert cDN 915.8
315060 AA551104 Hs.189048 ESTs, Moderately similar to ALUC_HUMAN ! 1236.8
332048 AW337575 Hs.201591 ESTs 522.6
337214 predicted exon 269
311598 AW023595 Hs.232048 ESTs 795.4
304782 AA582081 gb:nn32h08.s1 NCI_CGAP_Gas1 Homo sapiens
312802 AA644669 Hs.193042 ESTs 349.6
302680 AW192334 Hs.38218 ESTs 633.6 MQ
317452 AA972965 Hs.135568 ESTs 360.8
318558 AW402677 Hs.146381 RNA binding motif protein, X chromosome 700.2 6.6
312149 T90309 Hs.269651 ESTs 274.2
319267 F11802 Hs.6818 ESTs 238.2
321510 H75391 Hs.255748 ESTs 231.8
326198 predicted exon 581.6
315730 H25899 Hs.201591 ESTs 281.6
310442 AW072215 Hs.208470 ESTs -213 0^3
331237 W87874 Hs.25277 hypothetical protein FLJ21065 285 0.5
300469 BE301708 Hs.233955 hypothetical protein FLJ20401 26.6 0.3
338316 predicted exon 1494.2 34.7
330968 R44557 Hs.23748 ESTs 975.8 1.8
331019 NM_006033 Hs.65370 lipase, endothelial 201.2 0.9
331261 BE539976 Hs.103305 Homo sapiens mRNA; cDNA DKFZp434B0425 (f 478.6 1.3
301822 X17033 Hs.271986 integrin, alpha 2 (CD49B, alpha 2 subuni 355.2 1.7
325544 predicted exon 1014.6 9.4
328700 predicted exon 627.4 62.7
322882 AW248508 Hs.279727 Homo sapiens cDNA FLJ14035 fis, clone HE 84.8 5.7
336034 predicted exon 782.6 78.3
316580 AA938198 Hs.146123 hypothetical protein FLJ12972 746.4 13.8
309931 AW341683 gb:hd13d01.x1 Soares_NFL_T_GBC_S1 Homos 134.8 13.5
330692 R39288 Hs.6702 ESTs 137 13.7
319962 H06350 Hs.135056 Human DNA sequence from clone RP5-850E9 14.6 0.5
338033 predicted exon 540.6 14
314943 Y00272 Hs.184572 cell division cycle 2, G1 to S and G2 to -494.8
332640 BE568452 Hs.5101 protein regulator of cytokinesis 1 -600 1
338158 predicted exon 311.2 31.1
327036 predicted exon 351.8 35.2




302655 AJ227892 Hs.146274 ESTs

327568 predicted exon 

324801 AW770553 Hs.14553 sterol O-acyltransferase (acyl-Coenzyme 

317850 AI681545 Hs.152982 hypothetical protein FLJ13117

322818 AW043782 Hs.293616 ESTs 

324626 AI685464 Hs.292638 ESTs 
317224 X73608 Hs.93029 
310955 AI476732 Hs.263912 
315240 R38772 Hs.172619 



318617 AW247252 Hs.75514 



predicted exon 

iWM NSHbUf HS.IttMljO ESTs ZU4.H 

324716 BE169746 Hs.12504 hypothetical protein DKFZp761D081 203.6 
330305 predicted exon 199.8 

308248 AI560919 gb:tq41g10.x1 NCI_CGAP_Ut1 Homo sapiens 

gb:at76d10.x1 Barstead colon HPLRB7 Homo 198.2 
Homo sapiens cDNA FLJ11674 fis, clone HE 191.2
tumor differentially expressed 1 189.2
ESTs 187.6
hypothetical protein FLJ22672 271.6

Homo sapiens mRNA for WDC146, complete c 



1247.8 24.2 
206 20.6 



315622 A 

323675 R 

312164 T 

300378 Z 

317478 A 

317559 A 

317207 A 
334834 



Hs.258188 
Hs.272168 
Hs.221074 
Hs.235873 
Hs. 107000 
4 Hs. 129977 
Hs.214505 



328548 
317103 
318013 



AL121460 I 
AA884000 i 



2 AW1 73339 
4 AW975920 
6 AA377578 



3.8173 
Hs.144078 
Hs.154840
Hs.135665
Hs.283361 
Hs.65234 



ESTs 
ESTs 

predicted exon 

gb:HUM337C07BCIontech hurra 
hypothetical protein FLJ20508 
predicted exon 

hypothetical protein FLJ10803 

ESTs 

ESTs 

ESTs 

ESTs 

hypothetical protein FLJ20596 



184.2 
182.8 
178.8 



169.4 
169.2 
321.4 
1047.2 



309687 AW236154 Hs.77385 myosin,lightpolypeptide6,alkali,smoothmu 168.2 

323329 AL134744 Hs.10852 ESTs 

312853 W05086 Hs.114256 ESTs

313070 AI422023 Hs.161338 ESTs 

314096 AW977642 Hs.291742 ESTs 



312642 
339236 
317058 
311137 
310178 
320745 
317336 



332237 
312362 
331558 
316215 



313534 
317404 
311943 



315555 
317039 
331138 



AW292520 H 
AA888220 
AW052128 

AI217713 
AW207582 
AI936450 
H51896 
AW014637 
AW300366 
AC004076 
N52883 
AW015994 
N62401 
AI684535 



AI204202 
AW072916 



gb:oj15h01.s1 NCI_CGAP_Kid5 Homo sapiens 
gb:wx26c02.x1 NCI_CGAP_Kid11 Homo sapien
predicted exon 163.6 
Hs.147586 ESTs 161.8 
Hs.196042 ESTs 582.2 
Hs.147482 ESTs 161.2 
Hs.89278 hypothetical protein FLJ11186 161
Hs.130212 ESTs 160 

gb:xs63b05.x1 NCI_CGAP_Kid11 Homo sapien
Hs.129709 Homo sapiens chromosome 19, cosmid R3021 
Hs.102676 EST 159 

gb:UI-H-BIOp-abh-g-09-0-Ul.s1 NCI_CGAP_S 158.5 
Hs.48531 EST 158.6 
Hs.200811 ESTs 158.4 
predicted exon 157.4 
gb:Homo sapiens mRNA for immunoglobulin 155.8 
predicted exon 153.8 
Hs.29493 hypothetical protein FLJ20142 153.6 
Hs.50802 ESTs 153 
predicted exon 664.4 
152.6 
152.4 
152.2 
152.2 



Hs.78743 
Hs.126594 
Hs.26498 
Hs.152182 
Hs.44076 



Hs.239107 
Hs.126153 
Hs.28445 



zinc finger protein 131 (clone pHZ-10) 
ESTs 

hypothetical protein FLJ21657

ESTs 

EST 

predicted exon 
predicted exon 
ESTs 
ESTs 
ESTs 



151.4 
151.2 
151.2 
150.6 
316561 AI917222 Hs.121655 ESTs



15 



302282 
318781 
323709
310790 
316833 
323176 
324188 
317441 
317584 
321798 



301909 
309196 
321860 
330187 
323042 
313636



326077 
302004 
320668 
331212 
311268 
305159 
304510 



301808
318608 
316499 
317011 
321840 

327365
331264 



324137 
328262 
322349 
323504 



BE396283 Hs.173987
F11802 Hs.6818
AW297246 Hs.288546
AW192063 Hs.248865
AW292614 Hs.124367
NM_007350 Hs.82101
AW274439 Hs.252709
AA922798 Hs.196583
AI825890 Hs.220513
AI308206 Hs.181959
AA206045 
F20956 

AI702609 Hs.15713 
AI904895 Hs.9614 
N47474 Hs.212631 

AA463571 Hs.172550

AA262397 Hs.201366 

AB024729 Hs.227473 

AI473096 Hs.133403
M16951 

AI734258 Hs.245367 

AI927371 Hs.288839 

Y18264 Hs.123094 

AA805666 Hs.146217 

T88693 Hs.226410 

AI969727 Hs.231859 

AA659166 Hs.275668 

AA457391 Hs.119122 

AA772920 Hs.303527 

AW291944 Hs.122139 

AW449952 Hs.190125 

AI824879 Hs.211286 

AA001697 Hs.293565 

R40855 Hs.100839

AA719572 Hs.274441 



AI204491 Hs.151502
AW292947 Hs.122872
AI248760 Hs.150276 
N45600 Hs.46534 

AA278898 Hs.225979 
AW501944 Hs.127243 
AA211586 

AA825814 Hs.149065 
BE247449 Hs.31082 



AW134766 Hs.202450
AW019873 Hs.146840 
AA393127 Hs.222762 



310489 AW451493 Hs.235516 
335946 

318155 AI041546 Hs.132133 



333977 
324845 
331139 
331131 



149.4 
149.2 

eukaryotic translation initiation factor 148.4
ESTs 148.2
Homo sapiens cDNA FLJ14190 fis, clone NT 148
ESTs 147.8
ESTs 147.8
pleckstrin homology-like domain, family 229
ESTs 147.6
ESTs 147.4
ESTs 146.8
ESTs 146.8
gb:zq77f05.s1 Stratagene hNT neuron (937 146.6
gb:HSPD05390 HM3 Homo sapiens cDNA clone
ESTs 263.8
nucleophosmin (nucleolar phosphoprotein 146.2
ESTs 146.2
predicted exon 146
polypyrimidine tract binding protein (he 145.6
ESTs 1455
UDP-N-acetylglucosamine:a-1,3-D-mannosid 145
ESTs 144.8
gb:Human Ig mu-chain mRNA VDJ4-region, 5 144.6
ESTs, Weakly similar to ALU1_HUMAN ALU S
hypothetical protein FLJ12178 144.4
predicted exon 144.4
sal (Drosophila)-like 1 144
Homo sapiens cDNA: FLJ23077 fis, clone L 144
ESTs 144
ESTs 143.2
EST,WeaklysimilartoEF1D_HUMANELONGATIONF
ribosomalproteinL13a 142.8
ESTs 142.8
ESTs 142.8
basic-helix-loop-helix-PAS protein 142.6
ESTs, Weakly similar to 1207289A reverse 142.2
ESTs, Weakly similar to putative p150 [H 142.2
EST 142
Homo sapiens mRNA; cDNA DKFZp434N011 (fr
reticulon 3 141
ESTs 141
ESTs 140.8
ESTs 140.8
Homo sapiens mRNA; cDNA DKFZp434P0714 (f
predicted exon 140.8
hypothetical protein similar to small G 140.8
Homo sapiens mRNA for KIAA1724 protein, 140.4
gb:zn56d05.s1 Stratagene muscle 937209 H 140.2
ESTs 140.2
hypothetical protein FLJ10525 140.2
gb:nj28g06.s1 NCI_CGAP_AA1 Homo sapiens
ESTs 139.8
ESTs 139.8
ESTs 139.8
predicted exon 139.6
Homo sapiens cDNA FLJ10417 fis, clone NT 139.4
ESTs 139.4
gb:zf66d01.s1 Soares retina N2b4HR Homo 139.2
hypothetical protein PRO2955 139.2
predicted exon 139.2
ESTs 138.8
ESTs 138.6
predicted exon 138.6
ESTs 138.2
gb:yi16g12.s1 Soares placenta Nb2HP Homo 138.2
gb:yg87b07.s1 Soares infant brain 1NIB H 669.6



312498 
331252 
337407 
303973 
314582 
327373 
323367 
316207 
315231 



AW969635 Hs.283718 
R65706 
R54797 

H58539 Hs.151692 
AA668782 Hs.191284 ESTs, Weakly similar to ALU1_HUMAN ALU S
W52470 Hs.34578 " ' - 



AA234591 Hs.304123 
AA832065 Hs.120260
M705809 Hs.119922



138 



gb:xx68a03.x1 NCI_CGAP_Lym12 Homo sapien
ESTs 137.4 
predicted exon 137.2 
ESTs 136.6 
ESTs 136.4 
ESTs 136.2 
318592
320906 
328937 
329073 
318231 
311992 
316497 
317677 
321680 



cold shock domain protein A 
2 ESTs 
predicted exon 



Hs.134228 
Hs.114145
Hs.136849 
Hs.127868 
Hs.93704 



306573 

307383 AI223207 Hs.147888 EST 

311114 AW449382 Hs.195297 ESTs 
320579 



ESTs 
ESTs 
ESTs 
ESTs 
ESTs 

predicted exon 
Hs.172619 KIAA1106 protein 

ribosomal protein, large P2



312083 
319354 
307414 
312771 
313004 
300995 
319323 
329451 
337603 
312480 
324934 
320723 
318188 
320873 
331005 



3 R15138 

3 AA884104 

3 N58198 

3 H09604 

3 AF241850 

1 AA282330 

1 N93416 
AA351109 
T87398 



135.8 
135.8 
135.8 
135.8 



135.6 
135.6 
135.6 



Homo sapiens clone 25052 mRNA sequence 135 
134.8 

ESTs 
ESTs 

ret finger protein 2 
ESTs 



AI242106 
AA018515 
AI274963 
AW510641 
F12650 



Hs.118228 
Hs.5437 
Hs.205816 
Hs.167367 

Hs.264482 
Hs.145900 
Hs.258018 
Hs.13287 



134.8 
134.6 
134.4 
134.2 
134.2 
133.6 
133.2 
132.6 
132.6 



BE003191 
AA614406 
AI139253 
AA347945 
AW130320 
AW419225 
AW452334 
W49701 
AA806536 



302610 
309485 
311880 
313981 
322442 
315099 
304793 
330815 P 
304044 T 
325222 
325889 
321447 / 
302990 f 
308106 
310536 
315257 
318787 
312306 
326788 
312234 
314482 



302623 AW836724 

1 AI791531 

5 N55761 

! AA256465 

I M554913 



Hs.119555

Hs.227767 
Hs.256024 
Hs.108124 
Hs.256247 
Hs.128148 
Hs.29667 
Hs.291841 
Hs.182979 
I Hs.236463 K 



ESTs 

Tax1 (human T-cell leukemia virus type
ESTs 
ESTs 

gb:qh92a02.x1 Soares_NFLT_GBC_S1 Homo: 
Apg12 (autophagy 12, S. cerevisiae)-like 131.8
ESTs 131.2 
ESTs 220.6 
ESTs 125.4 
predicted exon 123.4 
predicted exon 572 
ESTs 121.4 
ESTs 119.4 
hypothetical protein FLJ20080 117 
gb:qi74f02.y5 NCI_CGAP_Ov26 Homo sapiens
Homo sapiens clone GLSH-2 similar to gli 112.8 
ESTs 112.6 
gb:np46f05.s1 NCI_CGAP_Br11 Homo sapiens



zinc finger protein 41 
ESTs 

ribosomalproteinS4,X-linked 
ESTs 
ESTs 
ESTs 



AI301041 
AW157431 
Z42313 
AI927225 



predicted exon
ESTs 
ESTs 

gb:tj77e12.x1 
ESTs 
ESTs 
ESTs 
Hs.175610 ESTs 

predicted exon 
ESTs 
ESTs 



111 

111 

110.2 

110.2 

109.4 

109 

108.8 

108.8 

714.8 



107.8 
106.2 
l_9W_OT_PA_?_S 



Hs.206934 
Hs.134182
Hs.135119 
Hs.194110 
Hs.129993
Hs.194718 
Hs.188725 



hypothetical protein PRO2730 
ESTs 

zinc finger protein 265 
ESTs 
ESTs 



324315 I 
314217 1 
320932 ; 
327876 

319736 R17424 Hs.6650 vacuolar protein sorting 45B (yeast homo 

327747 predicted exon 

327844 predicted exon 

318200 AI061192 Hs.166517 ESTs 

329414 predicted exon 

318296 AI089667 Hs.270713 ESTs 121.4 

307010 AI140014 gb:qa68f09.x1 Soares_fetal_heart_NbHH19W 295

319792 AI138635 Hs.22968 ESTs 385.4 



97.6 
97.4 
97.2 
305671 M811688 Hs.82113 dUTPpyrophosphatase 96

329440 predicted exon 93.8 

310381 AI263059 Hs.145594 ESTs 93.4 

318824 F06771 Hs.27226 ESTs 93.4 

328957 predicted exon 92.2 

318804 Z42549 Hs.160893 ESTs 92 

330836 AA055611 Hs.226568 ESTs, Moderately similar to ALU4_HUMAN A 92

324592 AW752437 Hs.325708 ESTs 91.8 

311820 AW274545 Hs.254333 ESTs 91.4 

321614 H86161 gb:ys94b01.r1 Soares retina N2b5HR Homo 91 

330306 predicted exon 91 

303096 AL080276 Hs.268562 regulator of G-protein signalling 17 90 

313275 AI027604 Hs.159650 ESTs 110.4

302593 H54855 Hs.36958 ESTs 88 

321421 BE465115 Hs.171688 ESTs 86.2 

330832 AI133530 Hs.62930 ESTs 456.4 

311847 AW301807 Hs.297260 ESTs 86 

322036 BE002723 Hs.301905 Homo sapiens cDNA FLJ14080 fis, clone HE 145.8

328688 predicted exon 85.6 

325251 predicted exon 85.4 

329088 predicted exon 85.4 

322524 W79027 Hs.271762 ESTs 84 

337953 predicted exon 451 

323529 AA284397 Hs.201485 Homo sapiens clone FLC0664 PRO2866 mRNA,

307041 AI144243 gb:qb85b12.x1 Soares_fetal_heart_NbHH19W

318285 AI332454 Hs.158412 ESTs 81.4 

312021 AA759263 Hs.14041 ESTs 81 

329350 predicted exon 81 

326169 predicted exon 80.4 

338038 predicted exon 1024.2 

312549 AI214510 Hs.146304 ESTs 77.4 

312542 D60076 gb:HUM084E10A Clontech human fetal brain 75.8

320992 AB026891 Hs.225972 solute carrier family 7, (cationic amino 76 

318596 AI470235 Hs.172698 EST 150.6 

315650 AA645042 Hs.269615 ESTs 73.4

324328 AA447276 Hs.292020 ESTs 210.4 

332622 R10674 Hs.128856 CSR1 protein 70.2

328229 predicted exon 69.4 

319110 T75260 Hs.98321 hypothetical protein FLJ14103 68.6 

316133 AI187742 Hs.125562 ESTs 308.6 

303992 AW515800 gb:hd88g01.x1 NCI_CGAP_GC6 Homo sapiens 

322675 AA017656 Hs.146580 enolase 2, (gamma, neuronal) 377.2 

325753 predicted exon 105.2 

312539 AI004377 Hs.200360 Homo sapiens cDNA FLJ13027 fis, clone NT 92.2

302592 AA294921 Hs.250811 v-ral simian leukemia viral oncogene hom 351.6

314578 AA410183 Hs.137475 ESTs 201.6 

335986 predicted exon 108.6 

321478 AW402593 Hs.123253 hypothetical protein FLJ22009 528

305192 AA666019 gb:ag44a04.s1 Jia bone marrow stroma Hom 58.6

304275 AA070605 gb:zm53h09.s1 Stratagene fibroblast (937 78.6 

302779 AJ235667 gb:Homo sapiens mRNA for immunoglobulin 278.8 

301976 T97905 Hs.77256 enhancer of zeste (Drosophila) homolog 2 479.2 

316021 AW293399 Hs.144904 nuclear receptor co-repressor 1 792.4 

320802 BE336699 Hs.185055 BENE protein 2423.8 

317282 AI733112 Hs.176101 ESTs 523.2 

316827 AI380429 Hs.172445 ESTs 578 

303190 BE280787 Hs.16079 hypothetical protein FLJ10233 223 

315587 AI268399 Hs.140489 ESTs 136.2 

333122 predicted exon 399 

310214 AI220072 Hs.165893 ESTs 234.4 

320089 D43945 Hs.113274 transcription factor EC 68 

309328 AW024348 Hs.233191 EST, Weakly similar to A27217 glucose tr 258.8 

318971 Z44067 Hs.10957 ESTs 376.6 

327220 predicted exon 47.4 

315757 AW014605 Hs.179872 ESTs 177.4 

320730 R68869 Hs.151072 ESTs 205.2 

313339 AI682536 Hs.163495 Homo sapiens cDNA FLJ13608 fis, clone PL 260 

318634 T49598 Hs.156832 ESTs 475.2 

320955 AW820035 Hs.278679 a disintegrin and metalloproteinasedoma 388.6 

306605 AI000497 Hs.119500 ribosomalprotein,largeP2 81.6

309349 AW051913 gb:wx24a09.x1 NCI_CGAP_Kid11 Homo sapien

306004 AA889992 Hs.2186 eukaryotictranslationelongationfactorlga 451.2 

330020 predicted exon 61.2 

302308 AW327279 Hs.91379 ribosomal protein L26 342

314648 AW979268 gb:EST391378 MAGE resequences, MAGP Homo 

315131 AI753709 Hs.152484 ESTs 130.4 
313690 AI493591 Hs.78146
333585 

312911 H93366 Hs.7567 
322966 AA633669 Hs.235920 
312492 R71072 Hs.191269 
318988 Z44203 Hs.26418 
AI123705 Hs.106932
AI025476 Hs.131628
AW205369 Hs.312830
AA127984 Hs.222024
AI829848 Hs.182937
AA373210 Hs.43047
AB033062 Hs.134970
N24236 Hs.179662
AL137449 Hs.126666
331384 AB041035 Hs.93847
300938 AA514416 Hs.152320
312695 AW196663 Hs.200242 
320223 W35132 Hs.267442 
332743 AW247977 Hs.87595 
331039 AW378685 Hs.18625 
333123 



324181 
311717 
321342 
308852 
331466 
320279 
322221 



311735 
312953 
313055 
313291 
315059 
322284 
322450 
324803 
331495 
333610 



AA643008 Hs.192775 
AW338564 Hs.217493 
AW294416 Hs.144687 
NM_001992 Hs.128087
AW367295 Hs.241175 
AI267970 Hs.150614 
AW275110 Hs.271106 
AI792140 Hs.49265 
AL121278 Hs.25144 
AW975183 Hs.292663 
AW970939 Hs.291039 



0 X04588 Hs.85344 
3 R56151 Hs.93589 
5 AW300094 Hs.136252



5 BE621807 Hs.3337 



2 AW797956 Hs.75748 
1 AA769365 Hs.126058
5 BE409857 Hs.69499 



335095 
335815 
330232 
330823 
331704 
302642 
304484 
310230 
301531 
306337 
331327 



AA031565 Hs.221255 

F04225 Hs.66032
NM_016428 Hs.130719

AA432067 Hs.258373 

AK000377 Hs.144840 

AI077462 Hs.134084 

AA954221 Hs.73742 

N46436 Hs.109221 



platelet/endothelial cell adhesion molec 3179.6 

predicted exon 175.4 

Homo sapiens cDNA: FLJ21962 fis, clone H 219

Homo sapiens cell recognition molecule C 350.2 

ESTs 322.8 

ESTs 25 

ESTs 773.4 

ESTs 634.8 

ESTs 54.2 

transcription factor BMAL2 23.4 

peptidylprolylisomeraseA(cyclophilinA) 92 

Homo sapiens cDNA FLJ13585 fis, clone PL 494 

DKFZP434N178 protein 76.2 

nucleosome assembly protein 1-like 1 253.2

homeo box B4 136.6 

NADPH oxidase 4 720 

ESTs, Weakly similar to 1605244A erythro 27

ESTs 303.8 

ESTs 189 

translocase of inner mitochondrial membr 14.4 

Mitochondrial Acyl-CoA Thioesterase 529.8 

predicted exon 396.2 

predicted exon 91.8 

predicted exon 406.4 

ESTs 413.4 

annexin A2 -30.8

Homo sapiens cDNA FLJ12981 fis, clone NT -62.8 

coagulation factor II (thrombin) recepto -73.6 

ESTs -43.8 

ESTs, Weakly similar to ALU4_HUMAN ALU S 

ESTs -67 

ESTs -395.2 

ESTs -1.6 

ESTs 4.4 

ESTs -282.8 

predicted exon -152.6 

predicted exon -23.2 

predicted exon -331.2 

neurotrophic tyrosine kinase, receptor, 591.2 
Homo sapiens mRNA; cDNA DKFZp564B1162 (f 

ESTs 135 

predicted exon 727.4 
ESTs, Weakly similar to P4HA_HUMAN PROLY 

hypothetical protein FLJ10408 304

predicted exon 109.2 

transmembrane 4 superfamily member 1 414.8 

predicted exon 87.8 

predicted exon 379.8 

proteasome (prosome, macropain) subunit, 589.2 

ESTs -87 

hypothetical protein 347.4 

predicted exon -1182 

predicted exon 106.4 

predicted exon -156 

predicted exon 102.6 
ESTs, Moderately similar to ALU5_HUMAN A -62 

ESTs -14.6 

NESH protein 267.6 

ESTs 85 

homolog of mouse C2PA -70 
ESTs 



,P0 



ESTs 

predicted exon 

322796 W31178 Hs.154140 Homo sapiens ovary-specific acidic prote



-33.4



315363 
302032 
313140 



T60843 Hs.189679 
AW134529 Hs.244647
AA759190 Hs.121454 



BE265133 Hs.217493 
AW015920 Hs.161359 
AI952430 Hs.150614 



6 ESTs 
predicted exon 
ESTs 
ESTs 

ESTs, Weakly similar to olfactory recept 
coagulation factor II (thrombin) recepto ■< 
annexin A2 9 
ESTs -: 
ESTs, Weakly similar to ALU4_HUMAN ALU S 
328520






predicted exon 


-1092 


0.2 


302406 


NM_012099 Hs.211956


CD3-epsilon-associated protein; antisens 


10 


0.2 


311804 


AI866921 


Hs.203349 


Homo sapiens cDNA FLJ12149 fis, clone MA 


-252.6 


0.2 


315065 


AK001122 


Hs.105859


hypothetical protein FLJ10260


-46.2 


0.2 


314129 


AA228366 


Hs.115122 


ESTs 


-308.8 


0.2 


335697 






predicted exon 


-47.2 


0.2 


335989 






predicted exon 


89 


0.2 


320606 


AW867943 


Hs.127216 


hypothetical protein FLJ13465 


-205.6 


0.2 


329745 






predicted exon 


103 


0.2 


313628 


AW419069 


Hs.209670 


ESTs 


-177.8 


0.2 


334616 






predicted exon 


-936.6 


0.2 


308820 


AI821267 


Hs.207243 


EST 


-7.2 


0.2 


320416 


AI026984 


Hs.293662 


ESTs 


-18.4 


0.2 


335211 






predicted exon 


-142 


0.2 


323629 


AA375957 


Hs.6682 


ESTs 


-100 


0.1 


331420 


AW452904 




gb:UI-H-BI3-aly-h-11-0-UI.s1 NCI_CGAP_Su


83 


0.1 


315984 


AI015862 


Hs.131793 


ESTs 


-250.6 


0.1 


332833 






predicted exon 


-374.2 


0.1 


332607 


NM_002314 Hs.36566


LIM domain kinase 1 


-27.6 


0.1 


313467 


AA004879 


Hs.187820


ESTs 


-288.2 


0.1 




AV651680 


Hs.208558 


ESTs 


-735.6 


0.1 


330775 
333168 


AW247020 


Hs.250747 


SUMO-1 activating enzyme subunit 1 


53.6 
-1041.8 


0.1 
0.1 


332079 


AI308876 


Hs.103849 


predicted exon 
ESTs 


19.4 


0.1 


322724 


AF161442 


Hs.191591 


Homo sapiens HSPC324 mRNA, partial cds 


-123.6 


0.1 


303652 


AI799111 


Hs.64341 


ESTs 


-46.4 


0.1 


303131 


AW081061 


Hs.103180 


DC2 protein 


-156.4 


0.1 


320716 


AI479439 


Hs.171532 


ESTs 


-146.6 


0.1 


300454 


AA659037 


Hs.163780


ESTs 


-304 


0.1 


312757 


AI285970 


Hs.183817


ESTs 


445 


0.1 


312391 


R43707 


Hs.133159 


ESTs, Weakly similar to PIHUSD salivary 


-111.8 


0.1 


308877 


AI832519 




gb:at69h03.x1 Barstead colon HPLRB7 Homo


-149.6


0 


311275 


AI659166 


Hs.207144 


ESTs 


-62.6 


0 


302363 


AW163799 Hs.198365 


2,3-bisphosphoglycerate mutase 


-15 


0 


321717 


AW956580 


Hs.42699 


ESTs 


-1059.6


0 


302638 


AA463798 


Hs.102696 


MCT-1 protein 


-332.2 


0 


306352 


AA961367 




gb:or52a05.s1 NCI_CGAP_GC3 Homo sapiens 


21.8 


313798 


AI292148 


Hs.71622 


SWI/SNF related, matrix associated, acti 


-97.2 


0 


320807 


AA135370


Hs.188536


Homo sapiens cDNA: FLJ21635 fis, clone C


-2222 


0 


320931 


AW262836 


Hs.252844 


ESTs 


-881.6 


0 


332450 


AW288085 


Hs.11156


hypothetical protein 


28.4 


0 


332535 


AF167706 


Hs.19280 


cysteine-rich motor neuron 1 
predicted exon 


-722 


0 


335990 






421 


0 


330746 


AB033888 


Hs.8619 


SRY (sex determining region Y)-box 18 


35.4 


0 


316820 


AI627912 


Hs.130783 


Forssman synthetase 


-373.6 


0 


337429 






predicted exon 


-257 


0 


331192 


BE622021 


Hs.152571


ESTs, Highly similar to IGF-II mRNA-bind 


-33 


0 


330609 


AI346201 


Hs.76118 


ubiquitin carboxyl-terminal esterase L1 


-280 


0 


323593 


AI739435 


Hs.39168 


ESTs 


-3627.6 


0 


302704 


AA531133 


Hs.4253 


hypothetical protein MGC2574 


-278.6 


0 


330534 


NM_004579Hs.82979 


mitogen-aciivating protein kinase kinase 


-244 


0 




X91195 


Hs.100623


phospholipase C, beta 3, neighbor pseudo 


-1204.2 




333221 






predicted exon 


-189.6 


0 


335988 






predicted exon 


-122.6 


0 


330574 


AI984144 


Hs.66713 


hepatitis delta antigen-interacting prot 


-2257.4 


0 


312052 


BE621697 


Hs.14317 


nucleolar protein family A, member 3 {HI 




0 


319568 


AF131781 


Hs.84753 


hypothetical protein FLJ12442 


-874.6


0 


337113 






predicted exon 


-24.6 


0 


335149 






predicted exon 


-191.8 


0 
TABLE 6A



Table 6A shows the accession numbers for those pkeys lacking unigeneID's for Table 6. The pkeys in Table 7 lacking unigeneID's are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
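
The three columns defined in this legend can be read mechanically. The following is a minimal sketch, assuming each row is whitespace-separated with the Pkey first, the CAT number second, and any Genbank accession numbers after that. The `ClusterRow` type and the `parse_cluster_row` function are hypothetical names introduced only for illustration, and the underscore in the example CAT number is an assumed reading of the printed value. Rows that carry only an accession number or only a genomic source label would need separate handling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterRow:
    """One Table 6A row (hypothetical container, for illustration only)."""
    pkey: str                      # unique Eos probeset identifier number
    cat_number: str                # gene cluster number, kept as printed
    accessions: List[str] = field(default_factory=list)  # Genbank accessions


def parse_cluster_row(line: str) -> ClusterRow:
    """Split a whitespace-separated row into Pkey, CAT number, accessions."""
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError(f"expected at least a Pkey and a CAT number: {line!r}")
    return ClusterRow(pkey=tokens[0], cat_number=tokens[1], accessions=tokens[2:])


# Example row; the underscore in the CAT number is an assumed reading.
row = parse_cluster_row("321614 87866_1 H86161 AA054308 AA018955")
print(row.pkey, row.cat_number, row.accessions)
```

Keeping the CAT number as a string avoids guessing at how cluster identifiers are structured.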



320925 1525201_1 D62892 D79755 D62760
321614 87866_1 H86161 AA054308 AA018955
313952 136885_1 F20956 M129374 AA133740 AW819878
314648 293660_1 AW979268 AA878419 AA431342 AA431628
302749 458J07 M16951 M16952 M16948 M16949 M16950
312362 764066_1 AW015994 R39898 AW000978 AI593202 AI521706
312542 1522649_1 D60076 D60259 D61037
312642 1005225_1 AW052128 H51439 H51481
312986 171879_1 AA211586 F35799 AA211641 F29720 AW937387 AW937408

329350 c_x_hs
329414 c_y_hs
329440 c_y_hs
329451 c_y_hs
338033 CH22_6528FG_LINK_EM:AC00
338038 CH22_6535FG_LINK_EM:AC00
338116 CH22_6650FG_LINK_EM:AC00
338158 CH22_6700FG_LINK_EM:AC00
329732 c14_p2
329745 c14_p2
308106 AI476803
329863 c14_p2
338316 CH22_6944FG_LINK_EM:AC00
308248 AI560919
338388 CH22_7034FG_LINK_EM:AC00
338442 CH22_7109FG_LINK_EM:AC00
338645 CH22_7410FG_LINK_EM:AC00
338728 CH22_7527FG_LINK_EM:AC00
308877 AI832519
338962 CH22_7838FG_LINK_DJ32I10
308886 AI833240
333120 CH22_349FG_81_3_LINK_EM:A
333121 CH22_350FG_81_4_LINK_EM:A
333122 CH22_351FG_81_6_LINK_EM:A
333123 CH22_352FG_81_7_LINK_EM:A
333168 CH22_400FG_94_1_LINK_EM:A
333169 CH22_401FG_94_2_LINK_EM:A
333221 CH22_458FG_105_1_LINK_EM:
326077 c17_hs
326080 c17_hs
326169 c17_hs
326198 c17_hs
326230 c17_hs
333585 CH22_845FG_203_4_LINK_EM:
333610 CH22_871FG_217_5_LINK_EM:
335093 CH22_2423FG_492_3_LINK_EM
335095 CH22_2425FG_492_5_LINK_EM
335149 CH22_2484FG_499_5_LINK_EM
326759 c20_hs
333977 CH22_1254FG_309_6_LINK_EM
326783 c20_hs
335211 CH22_2550FG_511_2_LINK_EM
305192 AA666019
303973 AW512014
303992 AW515800
326945 c21_hs
328229 c_6_hs
328262 c_6_hs
328418 c_7_hs
328455 c_7_hs
335697 CH22_3058FG_596_12_LINK_E
328520 c_7_hs
328548 c_7_hs
335815 CH22_3187FG_618_3_LINK_EM
328688 c_7_hs
328695 c_7_hs
307010 AI140014
337113 CH22_5058FG_493_1_
307041 AI144243
328700 c_7_hs
335946 CH22_3324FG_646_20_LINK_D
335986 CH22_3366FG_654_10_LINK_D
335987 CH22_3367FG_654_11_LINK_D
335988 CH22_3368FG_654_12_LINK_D
335989 CH22_3369FG_655_2_LINK_DJ
335990 CH22_3370FG_655_4_LINK_DJ
337214 CH22_5288FG_613_7_
330020 c16_p2
305989 AA888220
328857 c_7_hs
328937 c_8_hs
328957 c_8_hs
330187 c_4_p2
337407 CH22_5607FG_755_1_
337429 CH22_5633FG_762_3_
330232 c_5_p2
307414 AI242106
330305 c_7_p2
330306 c_7_p2
337603 CH22_5896FG_LINK_C20H12.
337953 CH22_6395FG_LINK_EM:AC00
339236 CH22_8181FG_LINK_BA354I1
339403 CH22J384FG_LINK_BA232E1
309349 AW051913
325222 c10_hs
325251 c10_hs
318188 956161_1 AI792566 AI053836 AI054127 AI792489 AI288324
309871 AW300366
325544 c12_hs
309931 AW341683
332833 CH22_50FG_17_7_LINK_C20H1
302779 33837_1 AJ235667 AJ235666 AJ235564 AJ235555 AJ235668 AJ235669 AJ235670
302790 34168_1 AJ245245 AJ245247 AJ245257 AJ245248 AJ245254 AJ245256 AJ245253 AJ245203 AJ245250 AJ245252 AJ245243 AJ245204 AJ245201 AJ245206 AJ245246 AJ245255 AJ245205 AJ245202 AJ245251 AJ245249 AJ245207 AJ245244
332961 CH22_185FG_48_18_LINK_EM:
325753 c14_hs



325889 c16_hs
304261 AA059387
304275 AA070605
334376 CH22_1670FG_379_8_LINK_EM
327220 c_1_hs
304363 AA206045
334458 CH22_1757FG_391_2_LINK_EM
327365 c_1_hs
327373 c_2_hs
334616 CH22_1923FG_411_15_LINK_E
327414 c_2_hs
327568 c_3_hs
336034 CH22_3419FG_678_5_LINK_DJ
336059 CH22J445FG_684_2_LINK_DJ
334834 CH22_2148FG_439_3_LINK_EM
304782 AA582081
304876 M595765
327747 c_5_hs
336228 CH22_3626FG_730_4_LINK_DA
329073 c_x_hs
329088 c_x_hs
304969 AA614406
327844 c_5_hs
327876 c_6_hs
306352 AA961367
331131 genbank_R54797 R54797
331139 genbank_R65706 R65706
331420 675963_1 AW452904 AW449414 BE467906 AI293565 BE549932 BE326357 F04362
TABLE 6B

Table 6B shows the genomic positioning for those pkeys lacking unigene ID's and accession numbers in Table 6. The pkeys in Table 7 lacking
unigeneID's are represented within Tables 1-6B. For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide
locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication
entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
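
As an illustration of how the strand and nucleotide-position columns described above might be consumed, here is a minimal sketch. It assumes the strand is printed as "Plus" or "Minus", the position is printed as two integers joined by a hyphen, and both endpoints are part of the exon (inclusive length). The function name `parse_exon_location` is hypothetical and not part of the original disclosure. Because some positions in the table are printed in descending order, the sketch orders the two coordinates before computing the length.

```python
def parse_exon_location(strand: str, nt_position: str) -> dict:
    """Return strand, ordered coordinates, and inclusive length of an exon."""
    strand_norm = strand.strip().capitalize()
    if strand_norm not in ("Plus", "Minus"):
        raise ValueError(f"unrecognized strand: {strand!r}")
    first_text, second_text = nt_position.strip().split("-")
    first, second = int(first_text), int(second_text)
    low, high = min(first, second), max(first, second)
    # Assumption: both endpoints belong to the exon, so the length is inclusive.
    return {"strand": strand_norm, "start": low, "end": high,
            "length": high - low + 1}


# Strand/position pairs as they are printed in the table.
print(parse_exon_location("Plus", "172397-172491"))
print(parse_exon_location("Minus", "7126-7232"))
```

Garbled entries, such as positions printed without the hyphen, are rejected rather than guessed at.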



Pkey Ref 

332961 Dunham, I. et al.

333221 Dunham, I. et al.

333585 Dunham, I. et al.

333610 Dunham, I. et al.

334376 Dunham, I. et al.

334458 Dunham, I. et al.

334616 Dunham, I. et al.

335149 Dunham, I. et al.

335211 Dunham, I. et al.

335697 Dunham, I. et al.

335986 Dunham, I. et al.

335987 Dunham, I. et al.

335988 Dunham, I. et al.

335989 Dunham, I. et al.

335990 Dunham, I. et al.



Strand Nt_position



2521424-2521555 

3978070-3978187 

6234778-6234894 

6547007-6547116 

13902218-13902331 

14353496-14353572 

15176123-15176470 

21497441-21497587 

21774611-21774680 

25481456-25481649 

27967791-27967852 

27971413-27971481 

27977912-27978013 



29014404-29014590 



335093 Dunham, I. et al.

335095 Dunham, I. 

335815 Dunham, I. et al.

335946 Dunham, I. et al.

336059 Dunham, I. et al.

336228 Dunham, I. et al.

337113 Dunhi 

337214 Dunhi 

337407 Dunham, I. et al.

337429 Dunham, I. et al.

337603 Dunham, I. et al.

338116 Dunham, I. et al.

338158 Dunham, I. et al.

338388 Dunham, I. et al.

338645 Dunham, I. et al.

338728 Dunham, I. et al.

339236 Dunham, I. et al.

339403 Dunham, I. et al.

325222 6525287 

325251 6682448 

325544 6682452 

325753 6682474 

329745 6065779 

329732 6065783 

329863 6691797 

325889 5867087 



Minus 
Minus 

Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 
Minus 



Minus 
Minus 
Minus 



3310817-3310749 



3730864-3730767 



17182681-17182535 
21297367-21297214 
21292546-21292381 
26320518-26320421 
27487203-27487035 
29184079-29183969 
30904602-30904497 
21233344-21233237 



24063839-24063775 

25949039-25948927 

32773355-32773202 

34050728-34050625 

22332-22473 

411693-411751

171228-171286 
325843 6552453 Minus 7126-7232
330020 6671887 Plus 172397-172491
326198 5867215 Minus 80295-80574
326230 5867230 Minus 301868-301972
326169 5867255 Minus 128321-128388
326077 6682495 Minus 312108-312168
326080 6682495 Plus 478644-478847
326759 6249610 Plus 97216-97311
326788 6682503 Plus 277132-277335
326946 6004446 Minus 116677-116967
327036 6531965 Plus 319951-320040
327220 5867525 Minus 65701-65781
327365 6552412 Minus 118133-118198
327414 5867750 Plus 102461-102586
327373 5867792 Minus 8186-8742
327568 5867811 Minus 46152-46287
330187 6706138 Plus 212923-213020
327747 5867947 Plus 115322-115498
327844 6249582 Minus 18895-18958
330232 6013526 Plus 113655-113830
328229 5868105 Minus 120936-121053
327876 5868140 Plus 103882-104034
328262 6381906 Plus 11867-12027
328688 5868262 Plus 626030-626094
328700 5868264 Plus 764089-764203
328695 5868264 Plus 318632-318695
328418 5868409 Minus 258811-258894
328455 5868431 Plus 385576-385633
328520 5868477 Plus 1942075-1942246
328548 5868487 Plus 72301-72397
328857 6381927 Minus 80557-81051
330305 4877982 Minus 52269-52365
330306 4877982 Plus 96161-96233
328937 5868500 Minus 1448241-1448333
328957 6456773 Plus 219195-219297
329073 5868596 Plus 37838-37956
329088 5868608 Plus 116738-116950
329350 6456785 Plus 98911-98969
329414 5868874 Plus 942555-942643
329440 5868885 Plus 21943-22063
329451 5868887 Plus 25974-26048
TABLE 7:



iy, and ExAccn for all of the sequences in Table 8. Seq ID No links the nucleic acid and protein 



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
Seq.ID.No.: Sequence Identification Number found in Table 8

Pkey  ExAccn  Unigene ID  Unigene Title  Seq.ID.No.






BE246154 


Hs.154210 


endothelial differentiation, sphingolipi 


Seq ID 1 & 2 


115819 


AA486620 


Hs.41135 


endomucin-2 


Seq ID 3 & 4 




NM_002205


Hs.149609 


integrin, alpha 5 (fibronectin receptor, 






AI016712


Hs.287797 


integrin, beta 1 (fibronectin receptor, 


Seq ID 7 & 8 








matrix metalloproteinase 10 (stromelysin


Seq ID 9 & 10 




AW338625 


Hs.22120 




Seq ID 11 & 12 




NM_005397


Hs.16426 




Seq ID 13 & 14 


102024 


AA301867 


Hs.76224 


EGF-containing fibulin-like extracellula 




102024 


AA301867 


Hs.76224 




Seq ID 17 & 18 




X68264 


Hs.211579 


melanoma cell adhesion molecule 




103036 


M13509 


Hs.83169 


matrix metalloproteinase 1 (interstitial 




104865 








Seq ID 23 & 24 




H93366 




Homo sapiens cDNA: FLJ21962 fis, clone H 


Seq ID 25 & 26 


109001 


AI056548 


Hs.72116 


hypothetical protein FLJ20992 similar to 


Seq ID 27 & 28 


104764 


AI039243 


Hs.278585 


ESTs 


Seq ID 29 & 30 


133200 


AB037715 


Hs.183639 


hypothetical protein FLJ10210


Seq ID 31 & 32


105263 


AW388633 


Hs.6682 


solute carrier family 7, (cationic amino 


Seq ID 33 & 34 


102892 


BE440042 


Hs.83326 


matrix metalloproteinase 3 (stromelysin 


Seq ID 35 & 36 




AW956580 


Hs.42699 




Seq ID 37 & 38


110906 


M035211 






Seq ID 39 & 40 




BE245360 


Hs.279477 




Seq ID 41 & 42


132050 


AI267615 


Hs.38022 


ESTs 


Seq ID 43 & 44 




NM_001290




LIM domain binding 2 


Seq ID 45 & 46 




AW161552 


Hs.83381 




Seq ID 47 & 48 






Hs.211587




Seq ID 49 & 50 




C18356 


Hs.295944 


tissue factor pathway inhibitor 2 


Seq ID 51 & 52 




H94997 






Seq ID 53 & 54 


118511


N75620


Hs.43157


ESTs 


Seq ID 54 & 55 




M21305 






Seq ID 56 & 57 




AA515902 


Hs.130650 




Seq ID 58 & 59 




AB018259 


Hs.118140


KIAA0716 gene product


Seq ID 60 & 61 




AW298600 


Hs.141840


ESTs, Weakly similar to S59501 interfero 


Seq ID 62 & 63 


313556 


AA628517 


Hs.118502




Seq ID 64 & 65 




AW751201 


Hs.51233 




Seq ID 66 & 67 




AL040178 


Hs.142003




Seq ID 68 & 69 




AF056085 


Hs.198612




Seq ID 70 & 71


iniW 


NM_005795




calcitonin receptor-like


Seq ID 72 & 73 




AA296520 


Hs.89546




Seq ID 74 & 75


103850 


AA187101 


Hs!213194 


hypothetical protein MGC10895 


Seq ID 76 & 77 


133260 


AA403045 


Hs.6906 


Homo sapiens cDNA: FLJ23197 fis, clone R 


Seq ID 78 & 79 


101097 


BE245301 


Hs.89414 


chemokine (C-X-C motif), receptor 4 (fus 


Seq ID 80 & 81 


104786 


AA027167 


Hs.10031 


KIAA0955 protein 


Seq ID 82 & 83 


132173 


X89426 


Hs.41716 


endothelial cell-specific molecule 1 


Seq ID 84 & 85 


100420 


D86983 


Hs.118893 


Melanoma associated gene 


Seq ID 86 & 87 


111018 


AI287912 


Hs.3628 


mitogen-activated protein kinase kinase 


Seq ID 88 & 89 


108507 


AI554545 


Hs.68301 


ESTs 


Seq ID 90 & 91 


104894 


AF065214 


Hs.18858 


phospholipase A2, group IVC (cytosolic, 


Seq ID 92 & 93 


118511 


N75620 


Hs.43157 


ESTs 


Seq ID 94 & 95 


125609 


AA868063 


Hs.104576 


carbohydrate (keratan sulfate Gal-6) sul 


Seq ID 96 & 97 


101543 


M31166 


Hs.2050 


pentaxin-related gene, rapidly induced b 


Seq ID 98 & 99 


102241 


NM_007351 


Hs.268107 


multimerin 


Seq ID 100 & 101 


101560 


AW958272 


Hs.347326 


intercellular adhesion molecule 2 


Seq ID 102 & 103 


103280 


U84722 


Hs.76206 


cadherin 5, type 2, VE-cadherin (vascula 


Seq ID 104 & 105 


105826 


AA478756 


Hs.194477 


E3 ubiquitin ligase SMURF2 


Seq ID 106 & 107 


102804 


NM_002318 


Hs.83354 


lysyl oxidase-like 2 


Seq ID 108 & 109 


131647 


AA359615 


Hs.30089 


ESTs 


Seq ID 110 & 111 


103095 


NM_005424 


Hs.78824 


tyrosine kinase with immunoglobulin and 


Seq ID 112 & 113 


103037 


BE018302 


Hs.2894 


placental growth factor, vascular endoth 


Seq ID 114 & 115 


100405 


AW291587 


Hs.82733 


nidogen 2 


Seq ID 116 & 117 


102012 


BE259035 


Hs.118400 


singed (Drosophila)-like (sea urchin fas 


Seq ID 118 & 119 
101261 


D30857 


Hs.82353 


protein C receptor, endothelial (EPCR) 


Seq ID 120 & 121 


105729 


H46612 


Hs.293815 


Homo sapiens HSPC285 mRNA, partial cds 


Seq ID 122 & 123 


107216 


D51069 


Hs.211579 


melanoma cell adhesion molecule 


Seq ID 124 & 125 


131080 


NM_001955 


Hs.2271 


endothelin 1 


Seq ID 126 & 127 


131486 


F06972 


Hs.27372 


BMX non-receptor tyrosine kinase 


Seq ID 128 & 129 


134299 


AW580939 


Hs.97199 


complement component C1q receptor 


Seq ID 130 & 131 


134983 


D28235 


Hs.196384 


prostaglandin-endoperoxide synthase 2 (p 


Seq ID 132 & 133 


115827 


AA428000 


Hs.283072 


actin related protein 2/3 complex, subun 


Seq ID 134 & 135 


133614 


NM_003003 


Hs.75232 


SEC14 (S. cerevisiae)-like 1 


Seq ID 136 & 137 


116483 


AI346201 


Hs.76118 


ubiquitin carboxyl-terminal esterase L1 


Seq ID 138 & 139 


132546 


M24283 


Hs.168383 


intercellular adhesion molecule 1 (CD54) 


Seq ID 140 & 141 


133678 


AW247252 


NA 


nucleoside phosphorylase 


Seq ID 142 & 143 


130184 


H58306 


Hs.15165 


retinoic acid induced 14 


Seq ID 144 & 145 


134786 


T29618 


Hs.89640 


TEK tyrosine kinase, endothelial (venous 


Seq ID 146 & 147 


129371 


X06828 


Hs.110802 


von Willebrand factor 


Seq ID 148 & 149 


418506 


AA084248 


Hs.85339 


G protein-coupled receptor 39 


Seq ID 150 & 151 


322262 


AA632012 


Hs.188746 


ESTs 


Seq ID 152 & 153 


312173 


AI821409 


Hs.304471 


EST 


Seq ID 154 & 155 


319795 


AB037821 


Hs.146858 


protocadherin 10 


Seq ID 156 & 157 


313978 


AI870175 


Hs.13957 


ESTs 


Seq ID 158 & 159 


306840 


AI077477 


Hs.307912 


ESTs 


Seq ID 160 & 161 


310272 


AF216389 


Hs.148932 


sema domain, transmembrane domain (TM), 


Seq ID 162 & 163 


310272 


AF216389 


Hs.148932 


sema domain, transmembrane domain (TM), 


Seq ID 164 & 165 


315044 


BE547674 


Hs.204169 


ESTs, Weakly similar to S65657 alpha-1 C- 


Seq ID 166 & 167 


321325 


AB033100 


Hs.300646 


KIAA1274 protein (similar to mouse palad 


Seq ID 168 & 169 


321325 


AB033100 


Hs.300646 


KIAA1274 protein (similar to mouse palad 


Seq ID 170 & 171 


303251 


AF240635 


Hs.115897 


protocadherin 12 


Seq ID 172 & 173 


302378 


AL109712 


Hs.296506 


Homo sapiens mRNA full length insert cDN 


Seq ID 174 & 175 


319267 


F11802 


Hs.6818 


ESTs 


Seq ID 176 & 177 


310442 


AW072215 


Hs.208470 


ESTs 


Seq ID 178 & 179 


300469 


BE301708 


Hs.233955 


hypothetical protein FLJ20401 


Seq ID 180 & 181 


331237 


W87874 


Hs.25277 


Homo sapiens cDNA FLJ10717 fis, clone NT 


Seq ID 182 & 183 


330968 


R44557 


Hs.23748 


ESTs 


Seq ID 184 & 185 


301822 


X17033 


Hs.271986 


integrin, alpha 2 (CD49B, alpha 2 subuni 


Seq ID 186 & 187 


422573 


AW297985 


Hs.295726 


integrin, alpha V (vitronectin receptor 


Seq ID 188 & 189 


133061 


AI186431 


Hs.296638 


prostate differentiation factor 


Seq ID 190 & 191 


135235 


AW298244 


Hs.266195 


ESTs 


Seq ID 192 & 193 


101192 


BE247295 


Hs.78452 


solute carrier family 20 (phosphate tran 


Seq ID 194 & 195 


113195 


H83265 


Hs.8881 


ESTs, Weakly similar to S41044 chromosom 


Seq ID 196 & 197 


101741 


NM_003199 


Hs.326198 


transcription factor 4 


Seq ID 198 & 199 


321911 


AF026944 


Hs.293797 


ESTs 


Seq ID 200 & 201 


320635 


N50617 


Hs.80506 


small nuclear ribonucleoprotein polypept 


Seq ID 202 & 203 


326230 






NM_017643:Homo sapiens hypothetical prot 


Seq ID 204 & 205 


132968 


AF234532 


Hs.61638 


myosin X 


Seq ID 206 & 207 


135073 


W55956 


Hs.94030 


Homo sapiens mRNA; cDNA DKFZp586E1624 (f 


Seq ID 208 & 209 


108937 


AL050107 


Hs.24341 


transcriptional co-activator with PDZ-bi 


Seq ID 210 & 211 


116430 


AK001531 


Hs.66048 


hypothetical protein FLJ10669 


Seq ID 212 & 213 






Hs.22968 


Homo sapiens clone IMAGE:451939, mRNA se 


Seq ID 214 & 215 


122697 


AA420683 


Hs.98321 


hypothetical protein FLJ14103 


Seq ID 216 & 217 


112522 


R68857 


Hs.265499 


ESTs 


Seq ID 218 & 219 


304782 


AA582081 




gb:nn32h08.s1 NCI_CGAP_Gas1 Homo sapiens 


Seq ID 220 & 221 


312802 


AA544669 


Hs.193042 


ESTs 


Seq ID 222 & 223 


302680 


AW192334 


Hs.38218 


ESTs 


Seq ID 224 & 225 


326198 






Phase 2 & 3 Exons 


Seq ID 226 & 227 


331019 


NM_006033 


Hs.65370 


lipase, endothelial 


Seq ID 228 & 229 
TABLE 8 



Nucleic Acid Accession #: NM_001400 

Coding sequence: 244-2208 (underlined sequences correspond to start and stop codons) 



GTCGGGGGCA 
CTTCGCCCTG 
CACAAAAAGC 
CGCCCTCTAG 
ACCATGGGGC 
GTCAACTATG 



ATCCTGGAGA 
ATGTACTATT 
GCTAACCTGC 
CGGGAAGGGA 
ATTGAGCGCT 
CTCTTCCTGC 
ATGGGCTGGA 
AAGCACTATA 
CTGTACTGCA 
AACATTTCCA 
ATCGTCCTGA 
GTGGGCTGCA 
GCTGTGCTCA 
CGGGCCTTCA 
TTCAAGCGAC 
CACCCCCAGA 
TCTTCTTCCT 
CCACCCCAGT 
CAAGCCAGAG 
TAGAGTTAGT 
TATATATTCT 
AGCTCCTAAA 
TCTTTGTCTG 

GTGTGCACTT 
TTCATACCCC 
CTGGGGTTGT 
TGGGAAGATG 
CATGTAAGCG 
CATCTTTTCA 
AAGCCCACTT 
AGCAAAACAA 
AAATGAGTCT 
TCTTGTGTGA 
CTTGATTTTT 
GTTAACTTTT 
CGCCAGAACT 
ACAAAGAATA 
AGATGTCTTG 
TTTGCACATA 



11 
I 

GCAGCAAGAT 
CTTGAGCGAG 
CTGGATCACT 
CGTTCGTCTG 
CCACCAGCGT 
ATATCATCGT 
ACAGCATTAA 
ACATCTTTGT 
TTATTGGCAA 
TCTTGTCTGG 
GTATGTTTGT 
ATATCACAAT 
TAATCAGCGC 
ACTGCATCAG 



GCTGCGGTTT 
CATCGAACCA 
GAGTAGCGCC 
CCCGCTGGTC 
CCGGCATTAC 
ACTGACCTCG 
CTTGCTGACC 
TCTGGCCCTC 
GGCCACCACC 
GGCCCTGTCA 
GCTGAAAATG 



GAATCTACTC 
AGGCCAGCCG 
GCGTCTTCAT 
AGGTGAAGAC 
ACTCCGGCAC 
TCCGGATCAT 
CCATCATCGC 



TGCGCTGTCC 
CACCACGGTC 
CTTGGTCAGG 
CAGCTCTGAG 
CGCCTGCTGG 
CTGTGACATC 
CAACCCCATC 
G1CCTGCTGC 



TCCTGTGAAC 
ACCCCCCTGG 
GGGTTCATTT 
GAGCTTTGAG 
CTGCTTCTTT 
TCCTCAACGT 



GGACAACCCA 
GCTGTCCACC 
AAATCTCTGS 
GGAGAATACG 
AATGCACTGG 
AGCTTTGATT 
GGCCCCTCCT 
GAGATGTTTT 
AGGGATGCCC 
TCTTTTACTT 
ATCATCTATA 



31 

I 

CCGTACAGAT 
CCGAGGCCCT 
CCCCTGAAGC 
ACCCCGGCTT 
AAGGCCCACC 
AACTACACGG 
GTGGTGTTCA 
ATTTGGAAAA 
TCAGACCTGT 
TACAAGCTCA 
GCCTCCGTGT 
AAACTCCACA 
ATCTCCCTCA 
AGCTGCTCCA 
TTCACTCTGC 
ACTCGGAGCC 
AAGTCGCTGG 
GCACCGCTCT 
CTCTTCAGAG 
ATTTACACTC 
AAGTGCCCGA 
TTCAGCCGCA 
GAGACCATTA 
CACCGGAAGC 
GCTTCGACTC- 
AACAGCCTGG 
GAAGGGTGGA 
TTGCACTGAG 
CAAAGACTAA 



CCTGGGGACA 
GCAGCTCGGT 
GAAAGCTGAA 
TTCTCATCTG 
CCAAGAAATT 
TGGCAGGAGT 



CCGAACGCAA 
GGAAAAGCTA 
TCTCTCGCCT 
CAGGGTTGGC 
CTCTGACTAC 
TATCAGCGCG 
CTGCTTTATC 
CCACCGACCC 
AGCCTACACA 
GTGGTTTCTG 
CGCCATCGCC 



ACGGGAGCAA 
TCCTGGGTGG 
CCGTGCTGCC 
TTCTGCTCTC 
GCCGCCTGAC 
CGCTGCTCAA 
TCATCCTGCT 
CGGAGTACTT 
TGACCAACAA 
GCGGAGACTC 
GCAAATCGGA 
TGTCTTCTGG 
GCTCTTTACT 
CTGCCAGGGA 
TGGTGTCGGG 



CCTGCCTATC 
GCTCTACCAC 
CATCGTCATT 
GTTCCGCAAG 
GACCGTAATT 
CCTGCTGGAT 
CCTGGTGTTA 
GGAGATGCGT 
TGCTGGCAAA 
CAATTCCTCC 
AAACGTCAAC 
TGGTCGCTGG 
G3AGCTGCTG 



TGTACATCCC 
TATACTTTAA 
GCAAATAGGC 



CCAAAGGTCT AGCATTGTCA 
TGTCCCCATG TGAAAGCGTC 
AGTTTCAAAC CCAAGTGAGT 
ACACCCCACC CTCCCTTCCC 
CTACCTGAGA GTTATCAGAG 



AAGATGGTTT GGAGGTGTAA AACAATGTCC 



TTTGGAATTT 
TTACCATTTC 
ATATTAGCCA 
GAATGGATTA 
ACATCCGTCT 
GCAACAACAT 
GTTTCAGGAA 
CCCTCTTGTG 
GCTATTCATT 
ACTGTCTCTT 
AAGAATAGTA T 
GCTTTATCAA CTTTTAAACA T 



ATGAAATGTG 
TATCTAAATG 
AGTGAAAACC 
AACAAATATG 
TTCATTTCAA 
GAATGTATTT 
CTAGAATCCA 

AAAATATATT 



GGTTGAAGTC 
ATATCCATTG 
GGATCCTTGG 



TTCGCTGAGG 
ACTTTGATTT 
AAGCCGAAAT 



TTCCCACTTT 
GTTGTATTTT 
GAAGTCATTT 
CCCTTAAGCA 
AGATAGTAAT 



GAAACAGACA 
ATTTCTTAGC 
TATTTCAGAA 
AAGTACTTTT 
TCTAACCCGT 
TGGTAGGGAA 
TATAAATATT 
TTAAACCGAG 



TTACTTTAAC T 
TGAAGATATG T 
TTCAGTGCAA T 
TTCTGACTTT TGTGGATCAT 270 0 

GATTTTTTTA A 



I AAG 



21 



31 



41 



31 



I I I I I I 

MGPTSVPLVK AHRSSVSDYV NYDIIVRHYN YTGKLNISAD KSNSIKLTSV VFILICCFII 
LENIFVLLTI WKTKKFHRPM YYFIGNLALS DLLAGVAYTA NLLLSGATTY KLTPAQWFLR 
EGSMFVALSA SVFSLLAIAI ERYITMLKMK LHNGSNNFRL FLLISACWVI SLILGGLPIM 
GWNCISALSS CSTVLPLYHK HYILFCTTVF TLLLLSIVIL YCRIYSLVRT RSRRLTFRKN 
ISKASRSSEK SLALLKTVII VLSVFIACWA PLFILLLLDV GCKVKTCDIL FRAEYFLVLA 
VLNSGTNPII YTLTNKEMRR AFIRIMSCCK CPSGDSAGKF KRPIIAGMEF SRSKSDNSSH 
PQKDEGDNPE TIMSSGNVNS SS 
Seq ID NO: 3 Nucleotide sequence: 
Nucleic Acid Accession #: NM_016242 
Coding sequence: 79-864 (underlined sequences correspond to start and stop codons) 



CAGCTTGGGA 



ACAACAAAAC 
CCAACAACTG 
ATGTCAACAG 
GTCAGGAAGA 
GCTGTTTCAA 
ACAGAAATAC 
ACCTCAATAC 
GGTGGAAAAA 
GTGGTTATTG 
ATGTGCTGGA 
AAAGAGAGCG 
GCACAAGGAA 
TACGCTTAAT 
AATCCCGACT 



GTAACAGCAC 
CATCTATAAC 
GAACAACTCC 
CTACTTTTTT 
ATGACTCCAT 
CATTACAAAG 
CAGGTAGTGT 
CAGTTACAAT 
ATGCAAGCAC 
CTTTGATTGT 
AGGCAGATCC 
TGAAGCTTCT 
AAACCAAGAA 
CTTCAGCTTC 
TCCATACCTG 



GGGAATTGTC 
GGAACTGCTT 
AGGTGTTTTA 
AACACCAAAC 



AACAAGTAAA 
CATTTCAAAC 
TTCCAAACCC 
TCTACAACCA 
TCCAGAAAAC 
TTCAGCAACC 
AATAACACTT 
GGGCACACCA 
TACCGTTAAG 
CTGACAGCTT 
TATGCACCAA 
CTGCTGG 



31 

I 

CCTGCCTGCT 
CAAGTGACCA 
GAGGCAGCTA 
ACAGAATCAT 
ATCACCAATG 
GATGAAGGAT 
GTAACAGTAA 
AAGACTGAAA 
GATGCATCAC 
ACCTCACAGT 
AGCCGGTCTT 
TCAGTATTTG 
GAAAATGGAA 
ACAATTTCTC 
GAGGAATTCT 
GCGTGGAAAA 



41 
I 

TCTGGAGAAA 
TTCTTTTTCT 

ATAATTCACT 
TACAGAAAAA 
AATTACTTAA 
TGAAAGCCAC 
CAAGTGTTAC 
CTCAGAGTTC 
CTTCTAAAAC 
CTCAAGTAAT 
ATTCCAGTAT 
TTCTGGTGGG 
ATGATCAACC 
ATGAGTCTGG 
CTCCACACCT 
GGAGAAAGTC 



GAAGATATTG 60 
TCTGCCCAGT 120 
TGTTGTTACT 180 
TGTTGTCACA 240 
AATGTCTCTG 300 
AACCACTGAT 360 
ACTTCCCAAT 420 
AATTAAAACA 480 
TGGTACATTA 540 
AGACACTGAG 600 
TATTTTGCCG 660 
TTTGTACCGA 720 
TCAGTCTGAT 780 
TGAGCACTCT 840 
AGGCAATAAT 900 
CTGCAGAATC 960 



Seq ID NO: 4 Protein sequence: 
Protein Accession #: NP_057326 



11 



31 



I 



41 



MELLQVTILF LLPSICSSNS TGVLBAANNS LVVTTTKPSI TTPNTESLQK NVVTPTTGTT 
PKGTITNELL KMSLMSTATF LTSKDEGLKA TTTDVRKNDS IISKTVTVTSV TLPNAVSTLQ 
SSKPKTETQS SIKTTEIPGS VLQPDASPSK TGTLTSIPVT IPENTSQSQV IDTEGGKNAS 
TSATSRSYSS IILPVVIALI VITLSVFVLV GLYRMCWKAD PGTPENGNDQ PQSDKESVKL 
LTVKTISHES GEHSAQGKTK N 
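
The coding-sequence coordinates listed with each nucleotide entry can be cross-checked against the length of the paired protein entry. The following is a minimal sketch only, assuming the coordinates are 1-based and inclusive and that the stated range includes the stop codon; the function name is illustrative and is not part of the original listing.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residue count implied by a 1-based, inclusive CDS range.

    Assumes the range ends with a stop codon, which is not translated.
    """
    n_nt = cds_end - cds_start + 1
    if n_nt < 6 or n_nt % 3 != 0:
        raise ValueError(f"CDS range {cds_start}-{cds_end} is not a whole number of codons")
    return n_nt // 3 - 1  # subtract the stop codon


# Seq ID NO: 3 lists its coding sequence as 79-864.
print(expected_protein_length(79, 864))  # 261
```

For the 79-864 range this gives 261 residues, in line with the layout of Seq ID NO: 4 (four rows of nominally 60 residues followed by a 21-residue row). A mismatch for another entry would point to a misread coordinate or a damaged sequence row rather than to a biological difference.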



Seq ID NO: 5 Nucleotide sequence: 



: and stop codons) 



CAGGACAGGG 
TGCAGCTGCG 
TGCCGCCGCC 
CGGGGCCCCC 
GGGTCAGTGT 
GTGGTGCTGT 
TTGACAGCAA 
CTGTGGAGTA 
TCTTGGCATG 
TGGGCACCTG 
GCTCAGATTT 
TCACCAAGAC 
TCCTGTCTGC 
TGGTTCAGGG 
GATACTCTGT 
TGCCCAAAGG 
CCCTCTACAA 
CAGACGTCAA 
GGACCCCTGA 
CCGGCATAGA 
GCAGCTCCTT 
GGGCTCCCTT 



I 

AAGAGCGGGC 
CTGGGGCCCC 
ACCCAGGGTC 
GGGCTCCTTC 
GCTGGTGGGA 
CTACCTCTGT 
AGGCTCTCGG 
CAAGTCCTTG 
CGCTCCACTG 
CTACCTCTCC 
CAGCTGGGCA 
TGGCCGTGTG 
CACTCAGGAG 
GCAGCTGCAG 
GGCTGTTGGT 
GAACCTCACT 
CTTCTCAGGG 
TGGGGACGGG 
CGGGCGGCCT 
GCCCACGCCC 
GACCCCCCTG 



CGGCGCCGAC 



GCACCCAAGG 
CCTTGGGGTG 
CTCCTGGAGT 
CAGTGGTTCG 



ACAGATAACT 
GCAGGACAGG 
GTTTTAGGTG 
CAGATTGCAG 
ACTCGCCAGG 
GAATTCAGTG 
TACGGCTATG 
GAACAGATGG 
CTGGATGACT 
CAGGAGGTGG 
ACCCTTACCC 
GGGGACCTGG 
ACCCAGCAGG 



CCCCGCTCGT 
ACTTAGACGC 
CAGTGGAGTT 
CTAATACCAG 
CCAGCCCCAC 
CCTCACTGTC 
GGGCAACAGT 
GCACAGAGAA 
TCACCCGAAT 
GTTACTGCCA 
GACCAGGAAG 
AATCTTATTA 
CCAGTTCCAT 
GTGATGACAC 
TCACCATCCT 
CCTCCTACTT 
TGCTGGTGGG 
GCAGGGTCTA 
TCACTGGCCA 
ACCAGGATGG 
GAGTAGTGTT 



41 

I 

AGAGTCCCCT 
GCCGCTGCTG 
GGAGGCCCCA 
TTACCGGCCG 
CCAGCCAGGA 
ACAGTGCACC 
CAGCTCAGAG 
TCGAGCCCAT 
GGAGCCACTG 
TCTGGAGTAT 
AGGAGGCTTC 
CTATTTCTGG 
CCCCGAGTAC 
CTATGATGAC 
AGAAGACTTT 
TAATGGCTCA 
TGGCTATGCA 
GGCACCCCTG 
CGTCTACCTG 
TGATGAGTTT 
CTACAATGAT 
TGTATTTCCT 



51 

I 

CTCCACGCCG 
TTGCTGCTCG 
GCAGTACTCT 
GGAACAGACG 
GTGCTGCAGG 
CCCATTGAAT 



GGCTCCTCCA 
AGCGACCCCG 
GCACCCTGCC 
AGTGCCGAGT 
CAAGGCCAGA 
CTGATCAACC 
AGCTACCTAG 
GTTGCTGGTG 
GACATTCGAT 
GTGGCCGCCA 
CTCATGGATC 
CAGCACCCAG 
GGCCGATTTG 
GTGGCCATCG 



1020 
1080 
1140 
1200 
1260 
1320 
GAGGGCTGGG CTCTAAGCCT TCCCAGGTTC TGCAGCCCCT GTGGGCAGCC AGCCACACCC 1380 
CAGACTTCTT TGGCTCTGCC CTTCGAGGAG GCCGAGACCT GGATGGCAAT GGATATCCTG 1440 
ATCTGATTGT GGGGTCCTTT GGTGTGGACA AGGCTGTGGT ATACAGGGGC CGCCCCATCG 1500 
TGTCCGCTAG TGCCTCCCTC ACCATCTTCC CCGCCATGTT CAACCCAGAG GAGCGGAGCT 1560 
GCAGCTTAGA GGGGAACCCT GTGGCCTGCA TCAACCTTAG CTTCTGCCTC AATGCTTCTG 1620 
GAAAACACGT TGCTGACTCC ATTGGTTTCA CAGTGGAACT TCAGCTGGAC TGGCAGAAGC 1680 
AGAAGGGAGG GGTACGGCGG GCACTGTTCC TGGCCTCCAG GCAGGCAACC CTGACCCAGA 1740 
CCCTGCTCAT CCAGAATGGG GCTCGAGAGG ATTGCAGAGA GATGAAGATC TACCTCAGGA 1800 
ACGAGTCAGA ATTTCGAGAC AAACTCTCGC CGATTCACAT CGCTCTCAAC TTCTCCTTGG 1860 
ACCCCCAAGC CCCAGTGGAC AGCCACGGCC TCAGGCCAGC CCTACATTAT CAGAGCAAGA 1920 
GCCGGATAGA GGACAAGGCT CAGATCTTGC TGGACTGTGG AGAAGACAAC ATCTGTGTGC 1980 
CTGACCTGCA GCTGGAAGTG TTTGGGGAGC AGAACCATGT GTACCTGGGT GACAAGAATG 2040 
CCCTGAACCT CACTTTCCAT GCCCAGAATG TGGGTGAGGG TGGCGCCTAT GAGGCTGAGC 2100 
TTCGGGTCAC CGCCCCTCCA GAGGCTGAGT ACTCAGGACT CGTCAGACAC CCAGGGAACT 2160 
TCTCCAGCCT GAGCTGTGAC TACTTTGCCG TGAACCAGAG CCGCCTGCTG GTGTGTGACC 2220 
TGGGCAACCC CATGAAGGCA GGAGCCAGTC TGTGGGGTGG CCTTCGGTTT ACAGTCCCTC 2280 
ATCTCCGGGA CACTAAGAAA ACCATCCAGT TTGACTTCCA GATCCTCAGC AAGAATCTCA 2340 
ACAACTCGCA AAGCGACGTG GTTTCCTTTC GGCTCTCCGT GGAGGCTCAG GCCCAGGTCA 2400 
CCGTGAACGG TGTCTCCAAG CCTGAGGCAG TGCTATTCCC AGTAAGCGAC TGGCATCCCC 2460 
GAGACCAGCC TCAGAAGGAG GAGGACCTGG GACCTGCTGT CCACCATGTC TATGAGCTCA 2520 
TCAACCAAGG CCCCAGCTCC ATTAGCCAGG GTGTGCTGGA ACTCAGCTGT CCCCAGGCTC 2580 
TGGAAGGTCA GCAGCTCCTA TATGTGACCA GAGTTACGGG ACTCAACTGC ACCACCAATC 2640 
ACCCCATTAA CCCAAAGGGC CTGGAGTTGG ATCCCGAGGG TTCCCTGCAC CACCAGCAAA 2700 
AACGGGAAGC TCCAAGCCGC AGCTCTGCTT CCTCGGGACC TCAGATCCTG AAATGCCCGG 2760 
AGGCTGAGTG TTTCAGGCTG CGCTGTGAGC TCGGGCCCCT GCACCAACAA GAGAGCCAAA 2820 
GTCTGCAGTT GCATTTCCGA GTCTGGGCCA AGACTTTCTT GCAGCGGGAG CACCAGCCAT 2880 
TTAGCCTGCA GTGTGAGGCT GTGTACAAAG CCCTGAAGAT GCCCTACCGA ATCCTGCCTC 2940 
GGCAGCTGCC CCAAAAAGAG CGTCAGGTGG CCACAGCTGT GCAATGGACC AAGGCAGAAG 3000 
GCAGCTATGG CGTCCCACTG TGGATCATCA TCCTAGCCAT CCTGTTTGGC CTCCTGCTCC 3060 
TAGGTCTACT CATCTACATC CTCTACAAGC TTGGATTCTT CAAACGCTCC CTCCCATATG 3120 
GCACCGCCAT GGAAAAAGCT CAGCTCAAGC CTCCAGCCAC CTCTGATGCC TGAGTCCTCC 3180 
CAATTTCAGA CTCCCATTCC TGAAGAACCA GTCCCCCCAC CCTCATTCTA CTGAAAAGGA 3240 
GGGGTCTGGG TACTTCTTGA AGGTGCTGAC GGCCAGGGAG AAGCTCCTCT CCCCAGCCCA 3300 
GAGACATACT TGAAGGGCCA GAGCCAGGGG GGTGAGGAGC TGGGGATCCC TCCCCCCCAT 3360 
GCACTGTGAA GGACCCTTGT TTACACATAC CCTCTTCATG GATGGGGGAA CTCAGATCCA 3420 
GGGACAGAGG CCCAGCCTCC CTGAAGCCTT TGCATTTTGG AGAGTTTCCT GAAACAACTG 3480 
GAAAGATAAC TAGGAAATCC ATTCACAGTT CTTTGGGCCA GACATGCCAC AAGGACTTCC 3540 
TGTCCAGCTC CAACCTGCAA AGATCTGTCC TCAGCCTTGC CAGAGATCCA AAAGAAGCCC 3600 
CCAGTAAGAA CCTGGAACTT GGGGAGTTAA GACCTGGCAG CTCTGGACAG CCCCACCCTG 3660 
GTGGGCCAAC AAAGAACACT AACTATGCAT GGTGCCCCAG GACCAGCTCA GGACAGATGC 3720 
CACAAGGATA GATGCTGGCC CAGGGCCAGA GCCCAGCTCC AAGGGGAATC AGAACTCAAA 3780 
TGGGGCCAGA TCCAGCCTGG GGTCTGGAGT TGATCTGGAA CCCAGACTCA GACATTGGCA 3840 
CCAATCCAGG CAGATCCAGG ACTATATTTG GGCCTGCTCC AGACCTGATC CTGGAGGCCC 3900 
AGTTCACCCT GATTTAGGAG AAGCCAGGAA TTTCCCAGGA CCTGAAGGGG CCATGATGGC 3960 
AACAGATCTG GAACCTCAGC CTGGCCAGAC ACAGGCCCTC CCTGTTCCCC AGAGAAAGGG 4020 
GAGCCCACTG TCCTGGGCCT GCAGAATTTG GGTTCTGCCT GCCACCTGCA CTGATGCTGC 4080 
CCCTCATCTC TCTGCCCAAC CCTTCCCTCA CCTTGGCACC AGACACCCAG GACTTATTTA 4140 
AACTCTGTTG CAAGTGCAAT AAATCTGACC CAGTGCCCCC ACTGACCAGA ACTAGAAAAA 4200 




Seq ID NO: 6 Protein sequence: 
Protein Accession #: NP_002196.1 

1 11 21 31 41 51 
| | | | | | 
MGSRTPESPL HAVQLRWGPR RRPPLVPLLL LLVPPPPRVG GFNLDAEAPA VLSGPPGSFF 60 
GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120 
LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DPVGTCYLST 180 
DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240 
IAESYYPEYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300 
GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 
EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420 
QQGVVFVFEG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480 
VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSFCLN ASGKHVADSI 540 
GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600 
LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660 
GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 
FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDVV 780 
SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 
SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900 
SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960 
YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 
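
The sequence rows in this table are laid out in space-separated blocks of ten symbols with a running position counter at the end of each row. The sketch below shows one way such rows might be joined into a single continuous string before further analysis. It is an illustration under stated assumptions, not part of the original listing: the function name is hypothetical, and all whitespace and digits are treated as layout, so a row in which a digit stands in for a misread base or residue would lose that position and should be corrected by hand first.

```python
import re


def join_listing_rows(rows):
    """Join sequence-listing rows into one string.

    Whitespace and digits (block spacing, position counters) are dropped;
    blank rows contribute nothing.
    """
    return "".join(re.sub(r"[\s\d]", "", row) for row in rows).upper()


# Two consecutive rows of Seq ID NO: 5 (positions 1321-1440).
rows = [
    "GAGGGCTGGG CTCTAAGCCT TCCCAGGTTC TGCAGCCCCT GTGGGCAGCC AGCCACACCC 1380",
    "",
    "CAGACTTCTT TGGCTCTGCC CTTCGAGGAG GCCGAGACCT GGATGGCAAT GGATATCCTG 1440",
]
sequence = join_listing_rows(rows)
print(len(sequence))  # 120
```

Comparing the joined length with the last position counter of an entry is a quick way to spot rows that were dropped or split when the listing was reproduced.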
Seq ID NO: 7 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002211 

Coding sequence: 104..2500 (underlined sequences correspond to start and stop codons) 



ATGTTTAAAA 
GTGGTGCACA 
TTTAGAAGCC 
CAAAGATATA 
CAAGCCAGAG 
GGAGCCACAG 
CTACCTTATG 
AACAGATCTG 



GCAAATGCCA 
AATTCAACAT 
TTAAAAAAGA 
AAGAAAAATA 
GATATTACTC 
ACATTTACAT 
GACCTGTCTT 



TTGCACAAGT 
TAATAAAGGA 
TTCTCCAGAA 
CTGGAGGAAT 
AGATGGGAAA 
TATGTACACA 
GAGTGAAAAT 
GGAGCTGAAA 
TGTAATTCAG 
CGGCAAATTG 
TGGAACAGGG 
TGAAATTAGC 
GCCTCTGGGC 
CCAAAGCGAA 
TGGCGCGTGC 
AGTTAACAGT 
TAACAATGGA 
AATTTATTCT 
AATTTGTGGA 
TGGCAGTGCA 
CTGCAATGGC 
AGGGCAAACG 
TGTTCAGTGC 
CTATTTTAAC 
TCCTGTGTCC 
AGTGAATGGG 
TCCAGACATC 
ATTACTGCTG 
TGAAAAGGAG 
CGTAACAACT 
ACAACACTGA 
TTGCCATGGT 
TGTTTTATTA 
ACAGGATGGT 
GACTATTGAA 
AGTTAAAGTA 
CTTTACATAT 
AAGGATTGTT 
CTGAAAGACA 
AAAGGCCATG 
GTGCCATTTT 
AGCCTTAGAT 
ACATACAGGC 
TTTATTATTT 
TGTATTTGTT 
AGCGGTCAAT 
TCACCTACGC 
CACCATTTGA 
GTTTTAACAG 



AAGACTGTGA 
GAACAGAACT 
GAAGTATTTA 
GGTGGTTTCG 
GTTACACGGC 
CTTGGTGGCA 



AATATTCAGA 
AACTTGATCC 
TTGATCATTG 
TCAGAAGGAG 
GAAAATGGAA 
ATAACTTCAA 



21 

I 

3 ATAGGGAAGA 
3 GCCCCGGACG 
GTTCAGTTTG 
AATCATGTGG 
TTTTACAGGA 
AGGGTTGCCC 
AAAATGTAAC 
AGATCCAACC 
TAAAATTCAA 
ATTCAATGAA 
TGAGGAGGAT 
TGCCTTACAT 
GCACCACCCC 
ATGAACTTGT 
ATGCCATCAT 
TGCTGGTGTT 
TTGTTTTACC 
ATTATGATTA 
CAATTTTTGC 



ACAGCACCCC 



CTGTGTGTTT 
AGAATGTATA 
AGGAATGCCT 
TCCAGATGAC 
CAACCGTAGC 
ACAGCAGTTG 
GAGAGCTGAA 
AGACGATTTG 



TAGCACAACA 
ATTTAGCTAC 
TGGAAAACAG 



ATGCATACAA 
TAACAATAAG 
GAAAATGTTC 
ATAAGTGTCC 



TTCCACAGAT 
AAATGATGGA 
TCCTTCTATT 
AGTTACTGAA 
AGTAGGAACA 
TTCCCTTTCC 
TTACAAATCT 
CAATATTTCC 



41 
I 

GGCGCCGATT 
AAGATGAATT 
GCTCAAACAG 
CAAGCAGGGC 
ACTTCTGCAC 
ATAGAAAATC 
AAAGGAACAG 
GTTTTGCGAT 
GACTATCCCA 
GAGAATGTAA 
TTCAGAATTG 
CCAGCTAAGC 
AAAAATGTGC 
CGCATATCTG 
GTTTGTGGAT 
GCCGGGTTTC 
CAATGTCACC 



TTTACGGAGG AAGTAGAGGT TATTCTTCAG 



GGCATCCCTG 
AGGTGCAATG 
GAAGACATGG 



AAAGTCCCAA 
AAGGGCGTGT 
ATGCTTACTG 



GGCAAATTCT 
GGAAATGGTG 
TGTGACTGTT 
CGGGGCATCT 
TGTGAGATGT 
AGAGCCTTCA 
ATTACCAAGG 
CATTGTAAGG 



CTTTGGATAC 
GCGAGTGTGG 
GTCAGACCTG 
ATAAAGGAGA 
TAGAAAGTCG 



TGGTAGACAT 
CAGGAAAGAA 
TGTTTGTAGG 
TAATTTCAAC 
TCGTGTGTGT 
TAGTACTTGT 
TGTCTGTAAG 



TTATCTGCAA 
TCAGAAGTCA 
TACTGCAAGA 
ATTGGAGATG 
TCTGACAGCT 
TACATCTGTG 
GGAAATGGGA 
TGTGAATGCA 
AACAGTTCAG 



I 

GCCGTACCAA 
TACAACCAAT 
ATGAAAATAG 
CAAATTGTGG 
GATGTGATGA 
CCAGAGGCTC 
CAGAGAAGCT 
TAAGATCAGG 
TTGACCTCTA 
AAAGTCTTGG 
GATTTGGCTC 
TCAGGAACCC 
TCAGTCTTAC 
GAAATTTGGA 
CACTGATTGG 
ACTTTGCTGG 
TGGAAAATAA 
TCCAGAAACT 
CTGTTTACAA 



TGTGATAGAT 
GAGTGCAACC 
GAAGCCAGCA 



ATATGGAAGC 
AAAATGAATG 
GTGGTCAATC 
ATGCAAAGTA 
TTTACTCATG 
TTTTGAAAAT 
TATTCTTGTC 
ATCAAGCTTA 
ATGAGCATGA 
GTCAGTTTGC 
TTAAATCTGT 
AGTATGTTGA 
GGAAAAATTC 
AAGAGTTACT 
AAAAGAACCG 
CATACTTTAC 
TTATTTTGTT 
TGCAATTTTG 
TTGCCTTTTT 
GAGTCTTACT 
GTTGCCCATC 



TCATGGTTCA 
TAGCTGGTGT 
TTTTAATGAT 
CCAAATGGGA 
CGAAGTATGA 
GCAATTTCCA 
TGCAGGTTTT 
AATGTTGTAA 
AGCTAAGGTC 
TTGGATTAAG 
TGAGAGTTTC 



AAAGAAAGAC 
GGACAAATTA 
TGACGACTGT 
TGTTGTGGAG 
GGTTGCTGGA 
AATTCATGAC 
CACGGGTGAA 



ACATGCACAC 
CCCCAGCCGG 

TGGTTCTATT 



TATTTTGCTA 
GAGTTGCTGG 
AGAGAGTTAG 
TAATGTTTGG 
AGCAATTTTC 
AAAGTATTTG 
TAATGTCTGG 
GGGTAAGACT 
AATGAACATG 
TTGAGTTAGT 
TTGTTTCACA 



TAGTCACAGT 
GAAAATGTAC 
TTCATGCCAG 
ACATTGTGCC 
TGATATTTCT 
TGTTAATCAT 
AATCCAAAGT 
TTTGCCTGTT 
TGTAAAATAC 
GAAGGAAAAA 
TAACTTTTAT 
TGCTAAAAAG 
CTGAATGGGG 
TGCTTTCTAT 
TTTTTATGAG 
TGAAGTTATA 
GCCATAACAG 
CTAGTCACAT 



ATTGTTCTTA 
AGAAGGGAGT 
AATCCTATTT 
GTACTGCCCG 
TAGGTAGCTT 
AATATGTATA 



AGGTTCAATT 
TTAAAATTAG 
AATGTGAATG 
CATTTGAGTG 
GCACAGATGA 
AAATCTGCAG 
ATACAAATGA 
CCAATGGCTT 
CCAACTACAC 
ACGGACAGAT 
CGAAGTTTCA 
ATAAAGAATG 
AGGAATGTTC 
TCCAACCTGA 
TTACGTATTC 
GTCCCACTGG 
TTGGCCTTGC 
TTGCTAAATT 



TTTTTGACCT 
ATAGCGATTG 
GTATTAAAAC 
AAATGTCCTG 
AGACATGACT 
GTTTGAAATA 
CCAATAGCTT 
GCCTTCACTT 
TCCTTGATTT 
ACCTTTTGAG 
CACCTCTTCT 
TACTTTTTCT 
CTGTGGCTAT 
ACCACTGTAT 
TCTTGTTTTA 



TGCAAATCCC 
TAGGGCAATA 
ATTTTTAAAA 
AAGACTTGAG 
TTTCTTCCTG 
AAAGGGCAAT 
TGATTTTTAG 
CTAGCTAGTT 
GATGACATAT 
GTTGATCTAC 
TAAAACCTGT 
TACAAATTCA 
AGCACTATTT 
TTGAATTTAT 
AATCTTTTAA 
TTGAAGTTTT 
GCAACAGCTC 
GTTTACTTCT 
AGTGCCTTTA 



1080 
1140 
1200 



TTTTGGAAAA 1260 



1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 



II I I I I 

MNLQPIFWIG LISSVCCVFA QTDENRCLKA NAKSCGECIQ AGPNCGWCTN STFLQEGMPT 
SARCDDLEAL KKKGCPPDDI ENPRGSKDIK KNKNVTNRSK GTAEKLKPED ITQIQPQQLV 120 
LRLRSGEPQT FTLKPKRAED YPIDLYYLMD LSYSMKDDLE NVKSLGTDLM NEMRRITSDF 180 
RIGFGSFVEK TVMPYISTTP AKLRNPCTSE QNCTSPFSYK NVLSLTNKGE VFNELVGKQR 240 
ISGNLDSPEG GFDAIMQVAV CGSLIGWRNV TRLLVFSTDA GFHFAGDGKL GGIVLPNDGQ 300 
CHLENNMYTM SHYYDYPSIA HLVQKLSENN IQTIFAVTEE FQFVYKELKN LIPKSAVGTL 360 
SANSSNVIQL IIDAYNSLSS EVILENGKLS EGVTISYKSY CKNGVNGTGE NGRKCSNISI 420 
GDEVQFEISI TSNKCPKKDS DSFKIRPLGF TEEVEVILQY ICECECQSEG IPESPKCHEG 480 
NGTFECGACR CNEGRVGRHC ECSTDEVNSE DMDAYCRKEN SSEICSNNGE CVCGQCVCRK 540 
RDNTNEIYSG KFCECDNFNC DRSNGLICGG NGVCKCRVCE CNPNYTGSAC DCSLDTSTCE 600 
ASNGQICNGR GICECGVCKC TDPKFQGQTC EMCQTCLGVC AEHKECVQCR AFNKGEKKDT 660 
CTQECSYFNI TKVESRDKLP QPVQPDPVSH CKEKDVDDCW FYFTYSVNGN NEVMVHVVEN 720 
PECPTGPDII PIVAGVVAGI VLIGLALLLI WKLLMIIHDR REFAKFEKEK MNAKWDTGEN 780 
PIYKSAVTTV WPKYEGK 

Seq ID NO: 9 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002425 

Coding sequence: 23..1453 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
| | | | | | 
AAAGAAGGTA AGGGCAGTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC 60 
AGTCTGCTCT GCCTATCCTC TGAGTGGGGC AGCAAAAGAG GAGGACTCCA ACAAGGATCT 120 
TGCCCAGCAA TACCTAGAAA AGTACTACAA CCTCGAAAAG GATGTGAAAC AGTTTAGAAG 180 
AAAGGACAGT AATCTCATTG TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA 240 
GGTGACAGGG AAGCTAGACA CTGACACTCT GGAGGTGATG CGCAAGCCCA GGTGTGGAGT 300 
TCCTGACGTT GGTCACTTCA GCTCCTTTCC TGGCATGCCG AAGTGGAGGA AAACCCACCT 360 
TACATACAGG ATTGTGAATT ATACACCAGA TTTGCCAAGA GATGCTGTTG ATTCTGCCAT 420 
TGAGAAAGCT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCTGTATGA 480 
AGGAGAGGCT GATATAATGA TCTCTTTCGC AGTTAAAGAA CATGGAGACT TTTACTCTTT 540 
TGATGGCCCA GGACACAGTT TGGCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA 600 
TATTCACTTT GATGATGATG AAAAATGGAC AGAAGATGCA TCAGGCACCA ATTTATTCCT 660 
CGTTGCTGCT CATGAACTTG GCCACTCCCT GGGGCTCTTT CACTCAGCCA ACACTGAAGC 720 
TTTGATGTAC CCACTCTACA ACTCATTCAC AGAGCTCGCC CAGTTCCGCC TTTCGCAAGA 780 
TGATGTGAAT GGCATTCAGT CTCTCTACGG ACCTCCCCCT GCCTCTACTG AGGAACCCCT 840 
GGTGCCCACA AAATCTGTTC CTTCGGGATC TGAGATGCCA GCCAAGTGTG ATCCTGCTTT 900 
GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT 960 
TTGGCGAAGA TCCCACTGGA ACCCTGAACC TGAATTTCAT TTGATTTCTG CATTTTGGCC 1020 
CTCTCTTCCA TCATATTTGG ATGCTGCATA TGAAGTTAAC AGCAGGGACA CCGTTTTTAT 1080 
TTTTAAAGGA AATGAGTTCT GGGCCATCAG AGGAAATGAG GTACAAGCAG GTTATCCAAG 1140 
AGGCATCCAT ACCCTGGGTT TTCCTCCAAC CATAAGGAAA ATTGATGCAG CTGTTTCTGA 1200 
CAAGGAAAAG AAGAAAACAT ACTTCTTTGC AGCGGACAAA TACTGGAGAT TTGATGAAAA 1260 
TAGCCAGTCC ATGGAGCAAG GCTTCCCTAG ACTAATAGCT GATGACTTTC CAGGAGTTGA 1320 
GCCTAAGGTT GATGCTGTAT TACAGGCATT TGGATTTTTC TACTTCTTCA GTGGATCATC 1380 
ACAGTTTGAG TTTGACCCCA ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG 1440 
GTTACATTGC TAGGCGAGAT AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA 1500 
ATTATTCATC TAATGTATTA TGAGCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT 1560 
GAAGAAGATG AGCCTTGCAG ATATCTGCAT GTGTCATGAA GAATGTTTCT GGAATTCTTC 1620 
ACTTGCTTTT GAATTGCACT GAACAGAATT AAGAAATACT CATGTGCAAT AGGTGAGAGA 1680 
ATGTATTTTC ATAGATGTGT TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC 1740 



I 

MMHLAFLVLL 
KKIQGMQKFL 
TPDLPRDAVD 
AHAYPPGPGL 
SFTELAQFRL 
LRGEYLFFKD 



I 



21 



31 



51 



FPRLIADDFP 



GLEVTGKLDT 
SAIEKALKVW 
YGDIHFDDDE 
SQDDVNGIQS 
RYFWRRSHWN 
YPRGIHTLGF 
GVEPKVDAVL 



I I I I 

SGAAKEEDSN KDLAQQYLEK YYKLEKDVKQ FRRKDSNLIV 
DTLEVMRKPR CGVPDVGEFS SFPGMPKKRK TKLTYRIVNY 
EEVTPLTFSR LYEGEADIMI SFAVKEHGDF YSFDGPGHSL 
KWTEDASGTN LFLVAAHELG HSLGLFHSAN TEALMYPLYN 
LYGPPPASTE EPLVPTKSVP SGSEMPAKCD PALSFDAIST 
PEPEFHLISA FWPSLPSYLD AAYEVNSRDT VFIFKGNEFW 
PPTIRKIDAA VSDKEKKKTY FFAADKYWRF DENSQSMEQG 
QAFGFFYFFS GSSQFEFDPK ARMVTHILKS NSWLHC 



Seq ID NO: 11 Nucleotide sequence: 
Nucleic Acid Accession #: XM_058189 

Coding sequence: 169..774 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GAAGACCAGC TCAGCTCTTC AGTTGTTGAT CATTGTCTAT TGTTCTCCAA ACAGTAAACC 60 
AGTATTTCAC ACTGAGATTG TCGGCTGCGG GTATATTCCA ATTCCCCGTC TCCTCATGAA 120 
TATGAAGTGA AGGGCTCTGA CCCTGGAAGT GGTTCTAAGC AGGGCftA AAT G GGGTCTCGG 180 
AAGTGTGGAG GCTGCCTAAG TTGTTTGCTG ATTCCGCTTG CACTTTGGAG TATAATCGTG 240 
AACATATTAT TGTATTTCCC GAATGGGCAA ACTTCCTATG CATCCAGCAA TAAACTCACC 300 
AACTACGTGT GGTATTTTGA AGGAATCTGT TTCTCAGGCA TCATGATGCT TATAGTAACA 360 
ACAGTTCTTC TGGTACTGGA GAATAATAAC AACTATAAAT GTTGCCAGAG TGAAAACTGC 420 
AGCAAAAAAT ATGTGACACT GCTGTCAATT ATCTTTTCTT CCCTCGGAAT TGCTTTTTCT 480 
GGATACTGCC TGGTCATCTC TGCCTTGGGT CTTGTCCAAG GGCCATATTG CCGCACCCTT 540 
GATGGCTGGG AGTATGCTTT TGAAGGCACT GCTGGACGTT TCCTTACAGA TTCTAGCATA 600 
TGGATTCAGT GCCTGGAACC TGCACATGTT GTGGAGTGGA ACATCATTTT ATTTTCCATT 660 
CTCATAACCC TCAGTGGGCT TCAAGTGATC ATCTGCCTCA TCAGAGTAGT CATGCAACTA 720 
TCCAAGATAC TGTGTGGAAG CTATTCAGTG ATCTTCCAGC CTGGAATCAT TTGAATAAGG 780 
ACAAAATGTT TTCCATTATC AAGACATGGC CATCTATCTA AATATTATAT CAACTGTGTA 840 
GACTTGAGGG CAATATTGAA ATGATGGTGC TTTCTGCATT TGGTGTTTAT TTGTAAAAAA 900 
TTTGCAGTCC TCACTGCACA TGCAAGTATA CCACCCTTCC ATTTAGTATG TTTTTTAAGT 960 
AATATGCATC AGAAACTTCA GAAATACTTC TGCCCTTTGA TCAAACAAAT CCATTTCCAA 1020 
GAATCTGTAC TAGGGAAGTA AATAAGAATA TGAGAGAAAC CTTTATGCAA ATATGTATAT 1080 
TGCAACATTA TTTAATATTC TGGAAAATTG GAAACACCCC AAAATTCTAA ACTCAGAGGA 1140 
AGGATTAAGT AAAGAGTGGT ACATACTGTA AATGTTTTCT GATATTAAAA AAAAAATTAA 1200 
ATAAAAAATA AAGAGTACTA CATGGTTGTA AAA 

Seq ID NO: 12 Protein sequence: 
Protein Accession #: XP_058189 



1 11 21 31 41 51 

I I I I I I 

MGSRKCGGCL SCLLIPLALW SIIVNILLYF PNGQTSYASS NKLTNYVWYF EGICFSGIMM 
LIVTTVLLVL ENNNNYKCCQ SENCSKKYVT LLSIIFSSLG IAFSGYCLVI SALGLVQGPY 
CRTLDGWEYA FEGTAGRFLT DSSIWIQCLE PAHVVEWNII LFSILITLSG LQVIICLIRV 
VMQLSKILCG SYSVIFQPGI I 
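
Each protein entry in this table is the conceptual translation of the coding sequence of the nucleotide entry that precedes it, so a damaged residue can often be checked against the corresponding codon. The following is a minimal sketch of such a check, assuming the standard genetic code; the helper names are illustrative and are not taken from the original listing, and any codon containing an unreadable symbol is reported as X.

```python
BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    residues = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3].upper(), "X")  # X marks an unreadable codon
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# First ten codons of the Seq ID NO: 11 coding sequence (from position 169).
print(translate("ATGGGGTCTCGGAAGTGTGGAGGCTGCCTA"))  # MGSRKCGGCL
# Last codons of the same coding sequence, ending at the stop codon at 772-774.
print(translate("AGCTATTCAGTGATCTTCCAGCCTGGAATCATTTGA"))  # SYSVIFQPGII
```

Both outputs agree with the first and last rows of Seq ID NO: 12 above, which is how a misread residue in a protein row can be resolved from the nucleotide row.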



Seq ID NO: 13 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005397 

Coding sequence: 251.. 1837 (underlined sequences correspond to start and stop codons) 



AAACGCCGCC CAGGACGCAG CCGCCGCCGC CGCCGCTCCT CTGCCACTGG CTCTGCGCCC 60 
CAGCCCGGCT CTGCTGCAGC GGCAGGGAGG AAGAGCCGCC GCAGCGCGAC TCGGGAGCCC 120 
CGGGCCACAG CCTGGCCTCC GGAGCCACCC ACAGGCCTCC CCGGGCGGCG CCCACGCTCC 180 
TACCGCCCGG ACGCGCGGAT CCTCCGCCGG CACCGCAGCC ACCTGCTCCC GGCCCAGAGG 240 
CGACGACACG ATGCGCTGCG CGCTGGCGCT CTCGGCGCTG CTCCTACTGT TGTCAACGCC 300 
GCCGCTGCTG CCGTCGTCGC CGTCGCCGTC GCCGTCGCCG TCGCCCTCCC AGAATGCAAC 360 
CCAGACTACT ACGGACTCAT CTAACAAAAC AGCACCGACT CCAGCATCCA GTGTCACCAT 420 
CATGGCTACA GATACAGCCC AGCAGAGCAC AGTCCCCACT TCCAAGGCCA ACGAAATCTT 480 
GGCCTCGGTC AAGGCGACCA CCCTTGGTGT ATCCAGTGAC TCACCGGGGA CTACAACCCT 540 
GGCTCAGCAA GTCTCAGGCC CAGTCAACAC TACCGTGGCT AGAGGAGGCG GCTCAGGCAA 600 
CCCTACTACC ACCATCGAGA GCCCCAAGAG CACAAAAAGT GCAGACACCA CTACAGTTGC 660 
AACCTCCACA GCCACAGCTA AACCTAACAC CACAAGCAGC CAGAATGGAG CAGAAGATAC 720 
AACAAACTCT GGGGGGAAAA GCAGCCACAG TGTGACCACA GACCTCACAT CCACTAAGGC 780 
AGAACATCTG ACGACCCCTC ACCCTACAAG TCCACTTAGC CCCCGACAAC CCACTTTGAC 840 
GCATCCTGTG GCCACCCCAA CAAGCTCGGG ACATGACCAT CTTATGAAAA TTTCAAGCAG 900 
TTCAAGCACT GTGGCTATCC CTGGCTACAC CTTCACAAGC CCGGGGATGA CCACCACCCT 960 
ACCGTCATCG GTTATCTCGC AAAGAACTCA ACAGACCTCC AGTCAGATGC CAGCCAGCTC 1020 
TACGGCCCCT TCCTCCCAGG AGACAGTGCA GCCCACGAGC CCGGCAACGG CATTGAGAAC 1080 
ACCTACCCTG CCAGAGACCA TGAGCTCCAG CCCCACAGCA GCATCAACTA CCCACCGATA 1140 
CCCCAAAACA CCTTCTCCCA CTGTGGCTCA TGAGAGTAAC TGGGCAAAGT GTGAGGATCT 1200 
TGAGACACAG ACACAGAGTG AGAAGCAGCT CGTCCTGAAC CTCACAGGAA ACACCCTCTG 1260 
TGCAGGGGGC GCTTCGGATG AGAAATTGAT CTCACTGATA TGCCGAGCAG TCAAAGCCAC 1320 
CTTCAACCCG GCCCAAGATA AGTGCGGCAT ACGGCTGGCA TCTGTTCCAG GAAGTCAGAC 1380 
CGTGGTCGTC AAAGAAATCA CTATTCACAC TAAGCTCCCT GCCAAGGATG TGTACGAGCG 1440 
GCTGAAGGAC AAATGGGATG AACTAAAGGA GGCAGGGGTC AGTGACATGA AGCTAGGGGA 1500 
CCAGGGGCCA CCGGAGGAGG CCGAGGACCG CTTCAGCATG CCCCTCATCA TCACCATCGT 1560 
CTGCATGGCG TCATTCCTGC TCCTCGTGGC GGCCCTCTAT GGCTGCTGCC ACCAGCGCCT 1620 
CTCCCAGAGG AAGGACCAGC AGCGGCTAAC AGAGGAGCTG CAGACAGTGG AGAATGGTTA 1680 
CCATGACAAC CCAACACTGG AAGTGATGGA GACCTCTTCT GAGATGCAGG AGAAGAAGGT 1740 
GGTCAGCCTC AACGGGGAGC TGGGGGACAG CTGGATCGTC CCTCTGGACA ACCTGACCAA 1800 
GGACGACCTG GATGAGGAGG AAGACACACA CCTCTAGTCC GGTCTGCCGG TGGCCTCCAG 1860 
CAGCACCACA GAGCTCCAGA CCAACCACCC CAAGTGCCGT TXGGATGGGG AAGGGAAAGA 1920 
CTGGGGAGGG AGAGTGAACT CCGAGGGGTG TCCCCTCCCA ATCCCCCCAG GGCCTTAATT 1980 
TTTCCCTTTT CAACCTGAAC AAATCACATT CTGTCCAGAT TCCTCTTGTA AAATAACCCA 2040 
CTAGTGCCTG AGCTCAGTGC TGCTGGATGA TGAGGGAGAT CAAGAAAAAG CCACGTAAGG 2100 
GACTTTATAG ATGAACTAGT GGAATCCCTT CATTCTGCAG TGAGATTGCC GAGACCTGAA 2160 
GAGGGTAAGT GACTTGCCCA AGGTCAGAGC CACTTGGTGA CAGAGCCAGG ATGAGAACAA 2220 
GAGGGTAAGT GACTTGCCCA AGGTCAGAGC CACTTGGTGA CAGAGCCAGG ATGAGAACAA 222 0 
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AGATTCCATT TGCACCATGC CACACTGCTG TGTTCACATG TGCCTTCCGT CCAGAGCAGT 228 0 

CCCGGGCAGG GGTGAAACTC CAGCAGGTGG CTGGGCTGGA AAGGAGGGCA GGGCTACATC 234 0 

CTGGCTCGGT GGGATCTGAC GACCTGAAAG TCCAGCTCCC AAGTTTTCCT TCTCCTACCC 2400 

CAGCCTCGTG TACCCATCTT CCCACCCTCT ATGTTCTTAC CCCTCCCTAC ACTCAGTGTT 2460 

TGTTCCCACT TACTCTGTCC TGGGGCCTCT GGGATTAGCA CAGGTTATTC ATAACCTTGA 2520

ACCCCTTGTT CTGGATTCGG ATTTTCTCAC ATTTGCTTCG TGAGATGGGG GCTTAACCCA 2580 

CACAGGTCTC CGTGCGTGAA CCAGGTCTGC TTAGGGGACC TGCGTGCAGG TGAGGAGAGA 2640 

AGGGGACACT CGAGTCCAGG CTGGTATCTC AGGGCAGCTG ATGAGGGGTC AGCAGGAACA 2700 

CTGGCCCATT GCCCCTGGCA CTCCTTGCAG AGGCCACCCA CGATCTTCTT TGGGCTTCCA 2760 

TTTCCACCAG GGACTAAAAT CTGCTGTAGC TAGTGAGAGC AGCGTGTTCC TTTTGTTGTT 2820

CACTGCTCAG CTGATGGGAG TGATTCCCTG AGACCCAGTA TGAAAGAGCA GTGGCTGCAG 2880 

GAGAGGCCTT CCCGGGGCCC CCCATCAGCG ATGTGTCTTC AGAGACAATC CATTAAAGCA 2940 

GCCAGGAAGG ACAGGCTTTC CCCTGTATAT CATAGGAAAC TCAGGGACAT TTCAAGTTGC 3000

TGAGAGTTTT GTTATAGTTG TTTTCTAACC CAGCCCTCCA CTGCCAAAGG CCAAAAGCTC 3060 

AGACAGTTGG CAGACGTCCA GTTAGCTCAT CTCACTCACT CTGATTCTCC TGTGCCACAG 3120

GAAAAGAGGG CCTGGAAAGC GCAGTGCATG CTGGGTGCAT GAAGGGCAGC CTGGGGGACA 3180 

GACTGTTGTG GGAACGTCCC ACTGTCCTGG CCTGGAGCTA GGCCTTGCTG TTCCTCTTCT 3240 

CTGTGAGCCT AGTGGGGCTG CTGCGGTTCT CTTGCAGTTT CTGGTGGCAT CTCAGGGGAA 3300

CACAAAAGCT ATGTCTATTC CCCAATATAG GACTTTTATG GGCTCGGCAG TTAGCTGCCA 3360 

TGTAGAAGGC TCCTAAGCAG TGGGCATGGT GAGGTTTCAT CTGATTGAGA AGGGGGAATC 3420

CTGTGTGGAA TGTTGAACTT TCGCCATGGT CTCCATCGTT CTGGGCGTAA ATTCCCTGGG 3480 

ATCAAGTAGG AAAATGGGCA GAACTGCTTA GGGGAATGAA ATTGCCATTT TTCGGGTGAA 3540 

ACGCCACACC TCCAGGGTCT TAAGAGTCAG GCTCCGGCTG TAGTAGCTCT GATGAAATAG 3600

GCTATCCACT CGGGATGGCT TACTTTTTAA AAGGGTAGGG GGAGGGGCTG GGGAAGATCT 3660

GTCCTGCACC ATCTGCCTAA TTCCTTCCTC ACAGTCTGTA GCCATCTGAT ATCCTAGGGG 3720

GAAAAGGAAG GCCAGGGGTT CACATAGGGC CCCAGCGAGT TTCCCAGGAG TTAGAGGGAT 3780

GCGAGGCTAA CAAGTTCCAA AAACATCTGC CCCGATGCTC TAGTGTTTGG AGGTGGGCAG 3840

GATGGAGAAC AGTGCCTGTT TGGGGGAAAA CAGGAAATCT TGTTAGGCTT GAGTGAGGTG 3900 

TTTGCTTCCT TCTTGCCCAG CGCTGGGTTC TCTCCACCCA GTAGGTTTTC TGTTGTGGTC 3960 

CCGTGGGAGA GGCCAGACTG GATTATTCCT CCTTTGCTGA TCCTGGGTCA CACTTCACCA 4020

GCCAGGGCTT TTGACGGAGA CAGCAAATAG GCCTCTGCAA ATCAATCAAA GGCTGCAACC 4080 

CTATGGCGTC TTGGAGACAG ATGATGACTG GCAAGGACTA GAGAGCAGGA GTGCCTGGCC 4140 

AGGTCGGTCC TGACTCTCCT GACTCTCCAT CGCTCTGTCC AAGGAGAACC CGGAGAGGCT 4200

CTGGGCTGAT TCAGAGGTTA CTGCTTTATA TTCGTCCAAA CTGTGTTAGT CTAGGCTTAG 4260 

GACAGCTTCA GAATCTGACA CCTTGCCTTG CTCTTGCCAC CAGGACACCT ATGTCAACAG 4320

GCCAAACAGC CATGCATCTA TAAAGGTCAT CATCTTCTGC CACCTTTACT GGGTTCTAAA 4380

TGCTCTCTGA TAATTCAGAG AGCATTGGGT CTGGGAAGAG GTAAGAGGAA CACTAGAAGC 4440

TCAGCATGAC TTAAACAGGT TGTAGCAAAG ACAGTTTATC ATCAACTCTT TCAGTGGTAA 4500 

ACTGTGGTTT CCCCAAGCTG CACAGGAGGC CAGAAACCAC AAGTATGATG ACTAGGAAGC 4560 

CTACTGTCAT GAGAGTGGGG AGACAGGCAG CAAAGCTTAT GAAGGAGGTA CAGAATATTC 4620

TTTGCGTTGT AAGACAGAAT ACGGGTTTAA TCTAGTCTAG GCRCCAGATT TTITTCCCGC 4680 

TTGATAAGGA AAGCTAGCAG AAAGTTTATT TAAACCACTT CTTGAGCTTT ATCTTTTTTG 4740 

ACAATATACT GGAGAAACTT TGAAGAACAA GTTCAAACTG ATACATATAC ACATATTTTT 4800 

TTGATAATGT AAATACAGTG ACCATGTTAA CCTACCCTGC ACTGCTTTAA GTGAACATAC 4860 

TTTGAAAAAG CATTATGTTA GCTGAGTGAT GGCCAAGTTT TTTCTCTGGA CAGGAATGTA 4920

AATGTCTTAC TGGAAATGAC AAGTTTTTGC TTGATTTTTT TTTTTAAACA AAAAATGAAA 4980 

TATAACAAGA CAAACTTATG ATAAAGTATT TGTCTTGTAG ATCAGGTGTT TTGTTTTGTT 5040

TTTTTAATTT TAAAATGCAA CCCTGCCCCC TCCCCAGCAA AGTCACAGCT CCATTTCAGT 5100 

AAAGGTTGGA GTCAATATGC TCTGGTTGGC AGGCAACCCT GTAGTCATGG AGAAAGGTAT 5160 

TTCAAGATCT AGTCCAATCT TTTTCTAGAG AAAAAGATAA TCTGAAGCTC ACAAAGATGA 5220

AGTGACTTCC TCAAAATCAC ATGGTTCAGG ACAGAAACAA GATTAAAACC TGGATCCACA 5280

GACTGTGCGC CTCAGAAGGA ATAATCGGTA AATTAAGAAT TGCTACTCGA AGGTGCCAGA 5340

ATGACACAAA GGACAGAATT CCTTTCCCAG TTGTTACCCT AGCAAGGCTA GGGAGGGCAT 5400 

GAACACAAAC ATAAGAACTG GTCTTCTCAC ACTTTCTCTG AATCATTTAG GTTTAAGATG 5460

TAAGTGAACA ATTCTTTCTT TCTGCCAAGA AACAAAGTTT TGGATGAGCT TTTATATATG 5520

GAACTTACTC CAACAGGACT GAGGGACCAA GGAAACATGA TGGGGGAGGC AAGAGAGGGC 5580 

AAAGAGTAAA ACTGTAGCAT AGCTTTTGTC ACGGTCACTA GCTGATCCCT CAGGTCTGCT 5640 

GCAAACACAG CATGGAGGAC ACAGATGACT CTTTGGTGTT GGTCTTTTTG TCTGCAGTGA 5700 

ATGTTCAACA GTTTGCCCAG GAACTGGGGG ATCATATATG TCTTAGTGGA CAGGGGTCTG 5760 

AAGTACACTG GAATTTACTG AGAAACTTGT TTGTAAAAAC TATAGTTAAT AATTATTGCA 5820
TTTTCTTACA AAAATATATT TTGGAAAATT GTATACTGTC AATTAAAGT 
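
By way of illustration only, the running position labels at the right of each row of a nucleotide listing such as the one above can be checked mechanically. The sketch below assumes that each label gives the position of the last base in its row and that bases are grouped in space-separated blocks; the function name `mislabeled_rows` is hypothetical and is not part of the original disclosure.

```python
def mislabeled_rows(rows):
    """Return indices of rows whose trailing numeric label differs from
    the running count of sequence letters (assumed labelling convention)."""
    bad = []
    total = 0
    for index, row in enumerate(rows):
        tokens = row.split()
        label = None
        # A purely numeric final token is treated as the position label.
        if tokens and tokens[-1].isdigit():
            label = int(tokens.pop())
        # Only alphabetic blocks count as sequence; stray digits are ignored.
        total += sum(len(block) for block in tokens if block.isalpha())
        if label is not None and label != total:
            bad.append(index)
    return bad


if __name__ == "__main__":
    rows = [
        "ACGTACGTAC ACGTACGTAC 20",
        "ACGTACGTAC ACGTACGTAC 4 0",  # label split by a stray space
        "ACGTAC",
    ]
    print(mislabeled_rows(rows))
```

A row flagged by such a check is a candidate for a transcription error in the label or in the sequence and would still need to be compared against the referenced accession.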

Seq ID NO: 14 Protein sequence:
Protein Accession #: NP_005388


1 11 21 31 41 51 

I I I I I I 

MRCALALSAL LLLLSTPPLL PSSPSPSPSP SPSQNATQTT TDSSNKTAPT PASSVTIMAT 60 

DTAQQSTVPT SKANSILASV KATTLGVSSD SPGTTTLAQQ VSGPVNTTVA RGGGSGNPTT 120 

TIESPKSTKS ADTTTVATST ATAKPNTTSS QNGAEDTTNS GGKSSHSVTT DLTSTKAEHL 180

TTPHPTSPLS PRQPTLTHPV ATPTSSGHDH LMKISSSSST VAIPGYTFTS PGMTTTLPSS 240 

VISQRTQQTS SQMPASSTAP SSQETVQPTS PATALRTPTL PHTMSSSPTA ASTTHRYPKT 300 

PSPTVAHESN WAKCEDLETQ TQSEKQLVLN LTGNTLCAGG ASDEKLISLI CRAVKATFNP 360 

AQDKCGIRLA SVPGSQTVVV KEITIHTKLP AKDVYERLKD KWDELKEAGV SDMKLGDQGP 420

PEEAEDRFSM PLIITIVCMA SFLLLVAALY GCCHQRLSQR KDQQRLTEEL QTVENGYHDN 480
PTLEVMETSS EMQEKKVVSL NGELGDSWIV PLDNLTKDDL DEEEDTHL

Seq ID NO: 15 Nucleotide sequence:
Nucleic Acid Accession #: NM_004105

Coding sequence: 150..1631 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CTAGTATTCT ACTAGAACTG GAAGATTGCT CTCCGAGTTT TTTTTTTGTT ATTTTGTTAA 60 

AAAATA^AAA GCTTGAGCAG CAATTCATAT TACTGTCACA GGTATTTTTG CTGTGCTGTG 120 

CAAGGTAACT CTGCTAGCTA AGATTCACAA TGTTGAAAGC CCTTTTCCTA ACTATGCTGA 180 

CTCTGGCGCT GGTCAAGTCA CAGGACACCG AAGAAACCAT CACGTACACG CAATGCACTG 240 

ACGGATATGA GTGGGATCCT GTGAGACAGC AATGCAAAGA TATTGATGAA TGTGACATTG 300 

TCCCAGACGC TTGTAAAGGT GGAATGAAGT GTGTCAACCA CTATGGAGGA TACCTCTGCC 360 

TTCCGAAAAC AGCCCAGATT ATTGTCAATA ATGAACAGCC TCAGCAGGAA ACACAACCAG 420

CAGAAGGAAC CTCAGGGGCA ACCACCGGGG TTGTAGCTGC CAGCAGCATG GCAACCAGTG 480 

GAGTGTTGCC CGGGGGTGGT TTTGTGGCCA GTGCTGCTGC AGTCGCAGGC CCTGAAATGC 540

AGACTGGCCG AAATAACTTT GTCATCCGGC GGAACCCAGC TGACCCTCAG CGCATTCCCT 600 

CCAACCCTTC CCACCGTATC CAGTGTGCAG CAGGCTACGA GCAAAGTGAA CACAACGTGT 660 

GCCAAGACAT AGACGAGTGC ACTGCAGGGA CGCACAACTG TAGAGCAGAC CAAGTGTGCA 720 

TCAATTTACG GGGATCCTTT GCATGTCAGT GCCCTCCTGG ATATCAGAAG CGAGGGGAGC 780

AGTGCGTAGA CATAGATGAA TGTACCATCC CTCCATATTG CCACCAAAGA TGCGTGAATA 840

CACCAGGCTC ATTTTATTGC CAGTGCAGTC CTGGGTTTCA ATTGGCAGCA AACAACTATA 900 

CCTGCGTAGA TATAAATGAA TGTGATGCCA GCAATCAATG TGCTCAGCAG TGCTACAACA 960 

TTCTTGGTTC ATTCATCTGT CAGTGCAATC AAGGATATGA GCTAAGCAGT GACAGGCTCA 1020 

ACTGTGAAGA CATTGATGAA TGCAGAACCT CAAGCTACCT GTGTCAATAT CAATGTGTCA 1080

ATGAACCTGG GAAATTCTCA TGTATGTGCC CCCAGGGATA CCAAGTGGTG AGAAGTAGAA 1140 

CATGTCAAGA TATAAATGAG TGTGAGACCA CAAATGAATG CCGGGAGGAT GAAATGTGTT 1200

GGAATTATCA TGGCGGCTTC CGTTGTTATC CACGAAATCC TTGTCAAGAT CCCTACATTC 1260 

TAACACCAGA GAACCGATGT GTTTGCCCAG TCTCAAATGC CATGTGCCGA GAACTGCCCC 1320 

AGTCAATAGT CTACAAATAC ATGAGCATCC GATCTGATAG GTCTGTGCCA TCAGACATCT 1380

TCCAGATACA GGCCACAACT ATTTATGCCA ACACCATCAA TACTTTTCGG ATTAAATCTG 1440 

GAAATGAAAA TGGAGAGTTC TACCTACGAC AAACAAGTCC TGTAAGTGCA ATGCTTGTGC 1500 

TCGTGAAGTC ATTATCAGGA CCAAGAGAAC ATATCGTGGA CCTGGAGATG CTGACAGTCA 1560 

GCAGTATAGG GACCTTCCGC ACAAGCTCTG TGTTAAGATT GACAATAATA GTGGGGCCAT 1620 

TTTCATTTTA GTCTTTTCTA AGAGTCAACC ACAGGCATTT AAGTCAGCCA AAGAATATTG 1680 

TTACCTTAAA GCACTATTTT ATTTATAGAT ATATCTAGTG CATCTACATC TCTATACTGT 1740 

ACACTCACCC ATAACAAACA ATTACACCAT GGTATAAAGT GGGCATTTAA TATGTAAAGA 1800 

TTCAAAGTTT GTCTTTATTA CTATATGTAA ATTAGACATT AATCCACTAA ACTGGTCTTC 1860 

TTCAAGAGAG CTAAGTATAC ACTATCTGGT GAAACTTGGA TTCTTTCCTA TAAAAGTGGG 1920 

ACCAAGCAAT GATGATCTTC TGTGGTGCTT AAGGAAACTT ACTAGAGCTC CACTAACAGT 1980 

CTCATAAGGA GGCAGCCATC ATAACCATTG AATAGCATGC AAGGGTAAGA ATGAGTTTTT 2040 

AACTGCTTTG TAAGAAAATG GAAAAGGTCA ATAAAGATAT ATTTCTTTAG AAAATGGGGA 2100 

TCTGCCATAT TTGTGTTGGT TTTTATTTTC ATATCCAGCC TAAAGGTGGT TGTTTATTAT 2160 

ATAGTAATAA ATCATTGCTG TACAACATGC TGGTTTCTGT AGGGTATTTT TAATTTTGTC 2220 

AGAAATTTTA GATTGTGAAT ATTTTGTAAA AAACAGTAAG CAAAATTTTC CAGAATTCCC 2280

AAAATGAACC AGATACCCCC TAGAAAATTA TACTATTGAG AAATCTATGG GGAGGATATG 2340 

AGAAAATAAA TTCCTTCTAA ACCACATTGG AACTGACCTG AAGAAGCAAA CTCGGAAAAT 2400

ATAATAACAT CCCTGAATTC AGGCATTCAC AAGATGCAGA ACAAAATGGA TAAAAGGTAT 2460

TTCACTGGAG AAGTTTTAAT TTCTAAGTAA AATTTAAATC CTAACACTTC ACTAATTTAT 2520 

AACTAAAATT TCTCATCTTC GTACTTGATG CTCACAGAGG AAGAAAATGA TGATGGTTTT 2580

TATTCCTGGC ATCCAGAGTG ACAGTGAACT TAAGCAAATT ACCCTCCTAC CCAATTCTAT 2640 

GGAATATTTT ATACGTCTCC TTGTTTAAAA TCTGACTGCT TTACTTTGAT GTATCATATT 2700 
TTTAAATAAA AATAAATATT CCTTTAGAAG ATCACTCTAA AA 



Seq ID NO: 16 Protein sequence: 
Protein Accession #: NP_004096 



1 11 21 31 41 51 

I I I I I I 

MLKALFLTML TLALVKSQDT EETITYTQCT DGYEWDPVRQ QCKDIDECDI VPDACKGGMK 60 

CVNHYGGYLC LPKTAQIIVN NEQPQQETQP AEGTSGATTG VVAASSMATS GVLPGGGFVA 120

SAAAVAGPEM QTGRNNFVIR RNPADPQRIP SNPSHRIQCA AGYEQSEHNV CQDIDECTAG 180

THNCRADQVC INLRGSFACQ CPPGYQKRGE QCVDIDECTI PPYCHQRCVN TPGSFYCQCS 240

PGFQLAANNY TCVDINECDA SNQCAQQCYN ILGSFICQCN QGYELSSDRL NCEDIDECRT 300

SSYLCQYQCV NEPGKFSCMC PQGYQVVRSR TCQDINECET TNECREDEMC WNYHGGFRCY 360

PRNPCQDPYI LTPENRCVCP VSNAMCRELP QSIVYKYMSI RSDRSVPSDI FQIQATTIYA 420

NTINTFRIKS GNENGEFYLR QTSPVSAMLV LVKSLSGPRE HIVDLEMLTV SSIGTFRTSS 480
VLRLTIIVGP FSF 
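
As a hedged illustration of how a "Coding sequence" annotation relates a nucleotide entry to its protein entry, the following sketch extracts a coding region from listing rows and translates it with the standard genetic code. It assumes that the stated coordinates are 1-based and inclusive of the stop codon, and that rows contain only sequence letters, spaces, and position labels; the helper names `clean_rows`, `extract_cds`, and `translate` are hypothetical and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean_rows(rows):
    """Join listing rows, keeping letters only (drops spaces and labels)."""
    return "".join(ch for row in rows for ch in row.upper() if ch.isalpha())


def extract_cds(seq, start, end):
    """Slice a coding region given 1-based inclusive coordinates (assumed)."""
    return seq[start - 1:end]


def translate(cds):
    """Translate codon by codon; stop at the first stop codon.
    Codons with ambiguous or unrecognised letters become 'X'."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    # Toy row in the same layout as the listing (not a sequence from this document).
    seq = clean_rows(["GGATGTTGAA ATAGCC 16"])
    cds = extract_cds(seq, 3, 14)
    print(cds, translate(cds))
```

Under these assumptions, an entry annotated "150..1631" would be processed by passing 150 and 1631 as `start` and `end`; a mismatch between the translation and the listed protein would suggest a transcription error in one of the two entries.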

Seq ID NO: 17 Nucleotide sequence: 
Nucleic Acid Accession #: NM_018894 

Coding sequence: 27..1967 (underlined sequences correspond to start and stop codons)

I I I I I I

AAAACATTCA ACAAATTAAT GGGTGTAAGG AACTGGAAAA CCTGGACTCC TACCACATGC 60 

AGATAAAACC AATAGAGTGC AGAATAAGAC TCAAGTCAAG TAAGTAACGT TAAACACCAT 120

AAAGACACAT GGCCTTCTTT GTGTACATGA CATGCATTCT CAACAATGCA CTGACGGATA 180 

TGAGTGGGAT CCTGTGAGAC AGCAATGCAA AGATATTGAT GAATGTGACA TTGTCCCAGA 240

CGCTTGTAAA GGTGGAATGA AGTGTGTCAA CCACTATGGA GGATACCTCT GCCTTCCGAA 300 

AACAGCCCAG ATTATTGTCA ATAATGAACA GCCTCAGCAG GAAACACAAC CAGCAGAAGG 360 

AACCTCAGGG GCAACCACCG GGGTTGTAGC TGCCAGCAGC ATGGCAACCA GTGGAGTGTT 420

GCCCGGGGGT GGTTTTGTGG CCAGTGCTGC TGCAGTCGCA GGCCCTGAAA TGCAGACTGG 480 

CCGAAATAAC TTTGTCATCC GGCGGAACCC AGCTGACCCT CAGCGCATTC CCTCCAACCC 540 

TTCCCACCGT ATCCAGTGTG CAGCAGGCTA CGAGCAAAGT GAACACAACG TGTGCCAAGA 600 

CATAGACGAG TGCACTGCAG GGACGCACAA CTGTAGAGCA GACCAAGTGT GCATCAATTT 660 

ACGGGGATCC TTTGCATGTC AGTGCCCTCC TGGATATCAG AAGCGAGGGG AGCAGTGCGT 720

AGACATAGAT GAATGTACCA TCCCTCCATA TTGCCACCAA AGATGCGTGA ATACACCAGG 780 

CTCATTTTAT TGCCAGTGCA GTCCTGGGTT TCAATTGGCA GCAAACAACT ATACCTGCGT 840 

AGATATAAAT GAATGTGATG CCAGCAATCA ATGTGCTCAG CAGTGCTACA ACATTCTTGG 900 

TTCATTCATC TGTCAGTGCA ATCAAGGATA TGAGCTAAGC AGTGACAGGC TCAACTGTGA 960

AGACATTGAT GAATGCAGAA CCTCAAGCTA CCTGTGTCAA TATCAATGTG TCAATGAACC 1020

TGGGAAATTC TCATGTATGT GCCCCCAGGG ATACCAAGTG GTGAGAAGTA GAACATGTCA 1080 

AGATATAAAT GAGTGTGAGA CCACAAATGA ATGCCGGGAG GATGAAATGT GTTGGAATTA 1140 

TCATGGCGGC TTCCGTTGTT ATCCACGAAA TCCTTGTCAA GATCCCTACA TTCTAACACC 1200 

AGAGAACCGA TGTGTTTGCC CAGTCTCAAA TGCCATGTGC CGAGAACTGC CCCAGTCAAT 1260 

AGTCTACAAA TACATGAGCA TCCGATCTGA TAGGTCTGTG CCATCAGACA TCTTCCAGAT 1320

ACAGGCCACA ACTATTTATG CCAACACCAT CAATACTTTT CGGATTAAAT CTGGAAATGA 1380 

AAATGGAGAG TTCTACCTAC GACAAACAAG TCCTGTAAGT GCAATGCTTG TGCTCGTGAA 1440 

GTCATTATCA GGACCAAGAG AACATATCGT GGACCTGGAG ATGCTGACAG TCAGCAGTAT 1500 

AGGGACCTTC CGCACAAGCT CTGTGTTAAG ATTGACAATA ATAGTGGGGC CATTTTCATT 1560 

TTAGTCTTTT CTAAGAGTCA ACCACAGGCA TTTAAGTCAG CCAAAGAATA TTGTTACCTT 1620

AAAGCACTAT TTTATTTATA GATATATCTA GTGCATCTAC ATCTCTATAC TGTACACTCA 1680

CCCATAACAA ACAATTACAC CATGGTATAA AGTGGGCATT TAATATGTAA AGATTCAAAG 1740

TTTGTCTTTA TTACTATATG TAAATTAGAC ATTAATCCAC TAAACTGGTC TTCTTCAAGA 1800 

GAGCTAAGTA TACACTATCT GGTGAAACTT GGATTCTTTC CTATAAAAGT GGGACCAAGC 1860 

AATGATGATC TTCTGTGGTG CTTAAGGAAA CTTACTAGAG CTCCACTAAC AGTCTCATAA 1920

GGAGGCAGCC ATCATAACCA TTGAATAGCA TGCAAGGGTA AGAATGAGTT TTTAACTGCT 1980

TTGTAAGAAA ATGGAAAAGG TCAATAAAGA TATATTTCTT TAGAAAATGG GGATCTGCCA 2040

TATTTGTGTT GGTTTTTATT TTCATATCCA GCCTAAAGGT GGTTGTTTAT TATATAGTAA 2100

TAAATCATTG CTGTACAACA TGCTGGTTTC TGTAGGGTAT TTTTAATTTT GTCAGAAATT 2160

TTAGATTGTG AATATTTTGT AAAAAACAGT AAGCAAAATT TTCCAGAATT CCCAAAATGA 2220

ACCAGATACC CCCTAGAAAA TTATACTATT GAGAAATCTA TGGGGAGGAT ATGAGAAAAT 2280 

AAATTCCTTC TAAACCACAT TGGAACTGAC CTGAAGAAGC AAACTCGGAA AATATAATAA 2340 

CATCCCTGAA TTCAGGCATT CACAAGATGC AGAACAAAAT GGATAAAAGG TATTTCACTG 2400 

GAGAAGTTTT AATTTCTAAG TAAAATTTAA ATCCTAACAC TTCACTAATT TATAACTAAA 2460 

ATTTCTCATC TTCGTACTTG ATGCTCACAG AGGAAGAAAA TGATGATGGT TTTTATTCCT 2520

GGCATCCAGA GTGACAGTGA ACTTAAGCAA ATTACCCTCC TACCCAATTC TATGGAATAT 2580 

TTTATACGTC TCCTTGTTTA AAATCTGACT GCTTTACTTT GATGTATCAT ATTTTTAAAT 2640 
AAAAATAAAT ATTCCTTTAG AAGATCACTC TAAAA 

Seq ID NO: 18 Protein sequence:

Protein Accession #: NP_061489.1

1 11 21 31 41 51 

MHSQQCTDGY EWDPVRQQCK DIDECDIVPD ACKGGMKCVN HYGGYLCLPK TAQIIVNNEQ 60

PQQETQPAEG TSGATTGVVA ASSMATSGVL PGGGFVASAA AVAGPEMQTG RNNFVIRRNP 120

ADPQRIPSNP SHRIQCAAGY EQSEHNVCQD IDECTAGTHN CRADQVCINL RGSFACQCPP 180

GYQKRGEQCV DIDECTIPPY CHQRCVNTPG SFYCQCSPGF QLAANNYTCV DINECDASNQ 240

CAQQCYNILG SFICQCNQGY ELSSDRLNCE DIDECRTSSY LCQYQCVNEP GKFSCMCPQG 300

YQVVRSRTCQ DINECETTNE CREDEMCWNY HGGFRCYPRN PCQDPYILTP ENRCVCPVSN 360

AMCRELPQSI VYKYMSIRSD RSVPSDIFQI QATTIYANTI NTFRIKSGNE NGEFYLRQTS 420
PVSAMLVLVK SLSGPREHIV DLEMLTVSSI GTFRTSSVLR LTIIVGPFSF 

Seq ID NO: 19 Nucleotide sequence:
Nucleic Acid Accession #: NM_006500

Coding sequence: 27.. 1967 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | |

ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60

TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120

CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180

AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 240

TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 300

TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360

GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 420

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540

TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600 

CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720

GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840

GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900

GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 1020 

TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080

CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140 

ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200 

TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500

TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 1800

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160

GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 


Protein Accession #: NP_006491 


1 11 21 31 41 51 

I I I I I I 

MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV 60 

DWFSVHKEKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120

PRSQEYRIQL RVYKAPEEPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP 180

LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE 240 

VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREASEETTN 300 

DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPERQEGSS 360 

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT 420 

QLVKLAIFGP PWMAFKERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV 480

LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILFLELVNLT TLTPDSNTTT GLSTSTASPH 540

TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY FLYKKGKLPC RRSGKQEITL 600
PPSRKTELVV EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRH

Seq ID NO: 21 Nucleotide sequence:
Nucleic Acid Accession #: NM_002421
Coding sequence: 72..1481 (underlined sequences correspond to start and stop codons)



GGGATATTGG AGTAGCAAGA GGCTGGGAAG CCATCACTTA CCTTGCACTG            60

CAAAGGCCAG TATGCACAGC TTTCCTCCAC TGCTGCTGCT GCTGTTCTGG GGTGTGGTGT 120

CTCACAGCTT CCCAGCGACT CTAGAAACAC AAGAGCAAGA TGTGGACTTA GTCCAGAAAT 180

ACCTGGAAAA ATACTACAAC            ATGGGAGGCA AGTTGAAAAG CGGAGAAATA 240

GTGGCCCAGT GGTTGAAAAA TTGAAGCAAA TGCAGGAATT CTTTGGGCTG AAAGTGACTG 300

GGAAACCAGA TGCTGAAACC CTGAAGGTGA TGAAGCAGCC CAGATGTGGA GTGCCTGATG 360

TGGCTCAGTT TGTCCTCACT GAGGGGAACC CTCGCTGGGA GCAAACACAT CTGACCTACA 420

GGATTGAAAA TTACACGCCA GATTTGCCAA GAGCAGATGT GGACCATGCC ATTGAGAAAG 480

CCTTCCAACT CTGGAGTAAT GTCACACCTC TGACATTCAC CAAGGTCTCT GAGGGTCAAG 540

CAGACATCAT GATATCTTTT GTCAGGGGAG ATCATCGGGA CAACTCTCCT TTTGATGGAC 600

CTGGAGGAAA TCTTGCTCAT GCTTTTCAAC CAGGCCCAGG TATTGGAGGG GATGCTCATT 660

TTGATGAAGA TGAAAGGTGG ACCAACAATT TCAGAGAGTA CAACTTACAT CGTGTTGCGG 720

CTCATGAACT CGGCCATTCT CTTGGACTCT CCCATTCTAC TGATATCGGG GCTTTGATGT 780

ACCCTAGCTA CACCTTCAGT GGTGATGTTC AGCTAGCTCA GGATGACATT GATGGCATCC 840

AAGCCATATA TGGACGTTCC CAAAATCCTG TCCAGCCCAT CGGCCCACAA ACCCCAAAAG 900

CGTGTGACAG TAAGCTAACC TTTGATGCTA TAACTACGAT TCGGGGAGAA GTGATGTTCT 960

           ATTCTACATG CGCACAAATC CCTTCTACCC GGAAGTTGAG CTCAATTTCA 1020

TTTCTGTTTT CTGGCCACAA CTGCCAAATG GGCTTGAAGC TGCTTACGAA TTTGCCGACA 1080

GAGATGAAGT CCGGTTTTTC AAAGGGAATA AGTACTGGGC TGTTCAGGGA CAGAATGTGC 1140

           CCCCAAGGAC ATCTACAGCT CCTTTGGCTT            GTGAAGCATA 1200

           TCTTTCTGAG GAAAACACTG                       GCTAACAAAT 1260

ACTGGAGGTA TGATGAATAT AAACGATCTA TGGATCCAGG TTATCCCAAA ATGATAGCAC 1320

ATGACTTTCC TGGAATTGGC CACAAAGTTG ATGCAGTTTT CATGAAAGAT GGATTTTTCT 1380

ATTTCTTTCA TGGAACAAGA CAATACAAAT TTGATCCTAA AACGAAGAGA ATTTTGACTC 1440

TCCAGAAAGC TAATAGCTGG TTCAACTGCA GGAAAAATTG AACATTACTA ATTTGAATGG 1500

AAAACACATG GTGTGAGTCC AAAGAAGGTG TTTTCCTGAA GAACTGTCTA TTTTCTCAGT 1560

CATTTTTAAC CTCTAGAGTC ACTGATACAC AGAATATAAT CTTATTTATA CCTCAGTTTG 1620

CATATTTTTT TACTATTTAG AATGTAGCCC TTTTTGTACT GATATAATTT AGTTCCACAA 1680

ATGGTGGGTA CAAAAAGTCA AGTTTGTGGC TTATGGATTC ATATAGGCCA GAGTTGCAAA 1740

GATCTTTTCC AGAGTATGCA ACTCTGACGT TGATCCCAGA            AGTGACAAAC 1800

ATATCCTTTC AAGACAGAAA GAGACAGGAG ACATGAGTCT TTGCCGGAGG AAAAGCAGCT 1860

CAAGAACACA TGTGCAGTCA CTGGTGTCAC CCTGGATAGG CAAGGGATAA CTCTTCTAAC 1920

ACAAAATAAG TGTTTTATGT            GTCAACCTTG TTTCTACTGT

41

51

MHSFPPLLLL LFWGVVSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV

VEKLKQMQEF FGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 
YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 

LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHEL GHSLGLSHST DIGALMYPSY 

TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR 

FYMRTNPFYP EVELNFISVF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY 

PKDIYSSFGF PRTVKHIDAA LSEENTGKTY FFVANKYWRY DEYKRSMDPG YPKMIAHDFP 

GIGHKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFNCRKN 



acleotide sequence: 



TCTGCGTGTG 
AGGCAAACAG 
CTCTACAGGC 
TCCGCGAGTT 



CAGTTCTCAT 



ACTTCATGTA 
CCGCCACCTA 
GCTATGAACC 
CGGCCCCTCC 



11 
I 

CCGGGGCTAG 
AGGAGGGAAG 
CTGTGTCGCT 
CACTCGCCAC 
CACTGACGTC 
CGCCTGCAGT 
GCTCTCTCTG 
CACTTCGCGC 
TTTGCAGATG 
TCTGGGCATC 
ACCAGGTAGT 



GGGCTGGAAG 
GCGTCTTAGG 
ATGGGTTCCC 
TCCTCCGACG 
ACGCTGCTGG 
GGCTTCTTCT 
CCCGGGGGTC 
CTGCGCCTCT 
GAGCACGTGG 
TCCCTGCGCC 
CCCAGGCGCT 



31 41 51 

I I 
TCCTGGCTCT AGTTGCACCT CGGAAGGAAA 
ACTGCCTGGA TCCAGAGCAC 1 
CCGCCGCCCC G 

CCTCAACGAG C 
ACCCCTCAGA G 
CCGGGGCCGT G 
AGGCTTCGCC C 



TTGGCGGGCA 
ATTCAATTTT 
CCGAAGCGAG 
CTCCAGCCAC 
TCCAGGCATG 
CCCTGGAAGC 



CCACCGCTTC 
AGAACCCCCA 
CCCAGACCCA 



CTGGGCTACG 
CTGCGCCTGC 
GCACACAAGG 
GCGGGAGTCG 
CCTCTATTGG 
GTCCTAGCGG 
ATCCAGGCCA 
ACACCCCCAA 
CCTACTGAAT

CTCGAAGCTG CAGTCAAGGC CCCCCCAGTC CAGCCAGCCC TGACCCCAAG GCCTGCAACT 720

GGAAAAAGTA CAAGTACATC GTGCTAAACT CTCAGGCCTC CCAAGCAGGG AGCCTGGTCG 780

GGGAGAGAAG TTCTGGTCAA CCTTGCCCCC AAGCCAGGCT CCCCAGTGGA GACGAGGCCT 840

CCAGCAGCAG CAGCAGCAGC AGCAGCAGCA GTGAAGAAGG ACCCATTCCT GGTCCCCAGA 900

GCAGGCTCTC TCCAACTGCT GCCACTGTGC AGTTCAAATG TGGGGCTCCA GCCAGTACCC 960

CCTACCTCCT CACATCCCAG GCTCAAGACA CCTCTGGATC ACCCTCTGAA CGGGCTCGTC 1020

CACTACCGGG AAGTGAATTT TTCAGCTGCC AGAACTGTGA GGCTGTGGCA GGGTGCTCAT 1080

CGGGGCTGGA CTCCTTGGTT CCTGGGGACG AAGACAAACC CTATAAGTGT CAGCTGTGCC 1140

GGTCTTCGTT CCGCTACAAG GGCAACCTTG CCAGTCATCG TACAGTGCAC ACAGGGGAAA 1200

AGCCTTACCA CTGCTCAATC TGCGGAGCCC GTTTTAACCG GCCAGCAAAC CTGAAAACGC 1260

ACAGCCGCAT CCATTCGGGA GAGAAGCCGT ATAAGTGTGA GACGTGCGGC TCGCGCTTTG 1320

TACAGGTGGC ACATCTGCGG GCGCACGTGC TGATCCACAC CGGGGAGAAG CCCTACCCTT 1380

GCCCTACCTG CGGAACCCGC TTCCGCCACC TGCAGACCCT CAAGAGCCAC GTTCGCATCC 1440

ACACCGGAGA GAAGCCTTAC CACTGCGACC CCTGTGGCCT GCATTTCCGG CACAAGAGTC 1500

AACTGCGGCT GCATCTGCGC CAGAAACACG GAGCTGCTAC CAACACCAAA GTGCACTACC 1560

ACATTCTCGG GGGGCCCTAG CTGAGCGCAG GCCCAGGCCC CACTTGGTTC CTGCGGGTGG 1620

GAAAGCTGCA GGCCCAGGCC TTGCTTCCCT ATCAGGCTTG GGCATAGGGG TGTGCCAGGC 1680

CACTTTGGTA TCAGAAATTG CCACCCTCTT AATTTCTCAC TGGGGAGAGC AGGGGTGGCA 1740

GATCCTGGCT AGATCTGCCT CTGTTTTGCT GGTCAAAACC TCTTCCCCAC AAGCCAGATT 1800

GTTTCTGAGG AGAGAGCTAG CTAGGGGCTG GGAAAGGGGA GAGATTGGAG TCCTGGTCTC 1860

CCTAAGGGAA TAGCCCTCCA CCTGTGGCCC CCATTGCATT CAGTTTATCT GTAAATATAA 1920

TTTATTGAGG CCTTTGGGTG GCACCGGGGC CTTCATTCGA TTGCATTTCC CACTCCCCTC 1980

TTCCACAAGT GTGATTAAAA GTGACCAGAA ACACAGAAGG TGAGATCACA GCTCTGCTGG 2040

CAGAGATTAC TAGCCCTTGG CTCTCTCGTT TGGCTTGGGT ATTTTATATT ATTTCTGTCA 2100

TAACTTTTAT CTTTAGAATT GTTCTTTCTC CTGTTTGTTT GCTTGTTAGT TTGTTTAAAA 2160

TGGAAAAAGG GGTTCTCTGT GTTCTGCCCC TGTAATTCTA GGTCTGGAAC CTTTATTTGT 2220

TCTAGGGCAG CTCTGGGAAC ATGCGGGATT GTGGAATTGG GTCAGGAACC CTCTCTGGTA 2280

TTCTGGATGT TGTAGGTTCT CTAGCAGTCT AGAAATGGAT ACAGACATTT CTCTGTTCTT 2340

CAAGGGTGAT AGGAACCATT ATGTTGAGCC CAAAATGGAA GTAATAATAA ATGCCTCCTG 2400

GAGGCTGTGG GTGTGGGGGA TTCTGTATCT GGATTCCGTA TCACTCCAAC TGGAGGCTGT 2460

GGGTGTGGGG GATTCTGTAT CTGGATTCCG TATCACTCCA AGTGGAGGCT GGCAGGTTTT 2520

TCTGCAAGAT GGTCCAGAAT CTAAAATGTC CCATTAATCT GGTCACTTGG GTTTGGCTCT 2580

GCTGTATCCA TCTATAGTGG TAGAGACCCA CCAGGGCTCA AGTGGAGTCC ATCATCCTCC 2640

CACGGGGGCC TGTTCTTAGC ACTGAGTTGA TCGCTCCATG GGGGAGAGAT CAGACATTCC 2700

TTATCAGAGA TGATGTGACC TTTTCTGACT CTGCCCAGTC TCTATGAATG TTATGGCCTA 2760

GGGAAGAATC ATGAAACTCT TTAGCTTGAT TAGATGGTAA ACAGTGTTAA CCCATCCTTT 2820

ACTACAGAGG CATATGGGTT TGAATGTTAC CTGGGGTTCT CTCTATTGAG TTGAGCCCCT 2880

TCTTCCTTTA GTGGGTTTTG GACATCTTCT GGCAAGTGTC CAGATGCCAG AACCTTCTTT 2940

TCCTCTAGAA GGGATGGTGC TTGGTAACCT TACCTTTTAA AAGCTGGGTC TGTGACCTGG 3000

TCTTCCCATC CCTGCATTCC TGTCTGGAAC CAGTGAATGC ATTAGAACCT TCCATAGGAA 3060

AAGAAAAGGG GCTGAGTTCC ATTCTGGGTT TGCTGTAGTT TGGTTGGGAT TATTGTTGGC 3120

ATTACAGATG TAAAAGATTG ACTAGCCCAT AGGCCAAAGG CCTGTTCTAG TTGACCAAGT 3180

TTCAAGTAGG ATTAAGAGGT TGGTTGAGGG GTGCAGTTTC TGGTGTAGGC CAGGTAGGTA 3240

GAAAGTGAGG AACAGGGTTG CCTCTTGGCT GGGTGGAGTC TCTGAAATGT TAGAAGAAGC 3300

GCTGAAGCCT TGATTGATAG TTCTGCCCCT TGTTGCCCTG GGGCTTATCT GATTATGGGA 3360

CGAGGGTAGA AAGTAAGAAG CACTTTTGAA TTTGTGGGGT AGAACTTCAA CAATAAGTCA 3420

GTTCTAGTGG CTGTCGCCTG GGGACTAGTG AGAAAGCTAC TCTTCTCCCT CTTCCCTCTT 3480

TCTCCCCATG GCCCCACTGC AGAATTAAAG AAGGAAGAAG GGAAGGCGGA GGAGTCTATA 3540

AGAAGGAATC ATGATTTCTA TTTAGCAGAT TGGATGGGCA GGTGGAGAAT GCCTGGGGGT 3600

AGAAATGTTA GATCTTGCAA CATCAGATCC TTGGAATAAA GAAGCCTCTC TGYGCWRAAA 3660
AAAAAAAAAA AAAAAA



Seq ID NO: 24 Protein sequence: 
Protein Accession #: FGENESH predicted 



1 11 21 31 41 51 

I I I I I I 

MGSEAAPEGA LGYVREFTRH SSDVLGNLNE LRLRGILTDV TLLVGGQPLR AHKAVLIACS 
GFFYSIFRGR AGVGVDVLSL PGGPEARGFA PLLDFMYTSR LRLSPATAPA VLAAATYLQM 
EHWQACHRF IQASYEPLGI SLRPLEAEPP TPPTAPPPGS PRRSEGHPDP PTESRSCSQG 
PPSPASPDPK ACNWKKYKYI VLMSQASQAG SLVGERSSGQ PCPQARLPSG DEASSSSSSS 
SSSSEEGPIP GPQSRLSPTA ATVQFKCGAP ASTPYLLTSQ AQDTSGSPSE RARPLPGSEF 
FSCQNCEAVA GCSSGLDSLV PGDEDKPYKC QLCRSSFRYK GNLASHRTVH TGEKPYHCSI 
CGARFNRPAN LKTHSRIHSG EKPYKCETCG SRFVQVAHLR AHVLIHTGEK PYPCPTCGTR 
FRHLQTLKSH VRIHTGEKPY HCDPCGLHFR HKSQLRLHLR QKHGAATNTK VHYHILGGP 



Seq ID NO: 25 Nucleotide sequence: 
Nucleic Acid Accession #: U21551 

Coding sequence: 1..1155 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I

ATGGATTGCA GTAACGGATC GGCAGAGTGT ACCGGAGAAG GAGGATCAAA AGAGGTGGTG 60
GGGACTTTTA
GACCCCAATA 
TCCTCAGAGT 
CCTGGCTCAT 
GGAGTAGATA 
TCTGCTGTGA 
CAGCTTGTGA 
CGTCCTGCAT 
CTCTTTGTAC 
TCCCTGTGGG 
ATGGGAGGGA 
CAGCAGGTCC 
CTTTTTCTTT 
GGCATCATTC 
GAATTTAAGG 
AACAGAGTGA 
ATACTGTACA 
AGCCGCATCT 
ATTGTGCTAT 



AGGCTAAAGA 
ATCTGGTTTT 
TTGGATGGGA 
CAGCTTTGCA 
ATAAAATTCG 
GGGCAACTCT 
AATTGGATCA 
TCATTGGAAC 
TCTTGAGCCC 
CCAATCCCAA 



TGTGGCTCTA 
ACTGGATAAA 
TTCCAGGAGT 
TGTCAGAGAG 



CCTAATAGTC 
TGGAACTGTG 
GAAACCTCAT 
CTATGCAGTG 
ACTGTTTCAG 
GCCGGTATTT 
AGAATGGGTC 
TGAGCCTTCT 
AGTGGGACCT 
GTATGTAAGA 
ATCTCTTTTT 
TGGCAGAGAC 
TGAAGATGGA 



ACACCAGCTA 
TTCAGGGATC 
ATCAAGCCTC 
GAATTATTTG 



AAGGCGAGAC 
TGAGCAAATT 
CCTGA 



ATACCTCACC 
TAGCTCTGGT 
AATACACATT 
AACTGATATC 



GACAAAGAAG 
CCATATTCAA 
CTTGGAGTCA 
TATTTTTCAA 
GCCTGGAAAG 
GCCCAATGTG 
CATCAGATCA 
GAAGAAGAAC 
TGCATTCTGG 
ATGGATGACT 
ACAGCCTGTG 
CCAACTATGG 
CAGTATGGAA 



CCATTTTAAA 
ATATGCTGAC 
TTCAGAACCT 
AAGGATTGAA 
ACATGGATAG 
AGCTCTTAGA 
CATCTGCTAG 
AGAAGCCTAC 
GTGGAACCTT 
GTGGAACTGG 
AAGACGTAGA 
CTGAAGTGGG 
TGGCAACTCC 
ACCTGGCACA 



GGAAAAACCA 
GGTGGAGTGG 
GTCATTGCAC 
GGCATTTCGA 
AATGTATCGC 
GTGTATTCAA 
TCTGTATATT 
CAAAGCCCTG 



GGACTGCAAG 
TAATGGGTGT 
AACTATGAAT 
TCCACTAGAT 



TTGTTTGCCC 
AGAATGGTCC 
GAGAAGAGAG 



CCTGGAGGGG 
AGTTTCTGAT 
TAAGCTGGCA 
CGACTGGACA 



11 



31 



51 



I I I I 

MDCSNGSAEC TGEGGSKEVV GTFKAKDLIV TPATILKEKP DPNNLVFGTV FTDHMLTVEW
SSEFGWEKPH IKPLQNLSLH PGSSALHYAV ELFEGLKAFR GVDNKIRLFQ PNLNMDRMYR 
SAVRATLPVF DKEELLECIQ QLVKLDQEWV PYSTSASLYI RPAFIGTEPS LGVKKPTKAL 
LFVLLSPVGP YFSSGTFNPV SLWANPKYVR AWKGGTGDCK MGGNYGSSLF AQCEDVDNGC 
QQVLWLYGRD HQITEVGTMN LFLYWINEDG EEELATPPLD GIILPGVTRR CILDLAHQWG 
EFKVSERYLT MDDLTTALEG NRVREMFSSG TACWCPVSD ILYKGETIHI PTMENGPKLA 
SRILSKLTDI QYGREESDWT IVLS 



Seq ID NO: 27 Nucleotide sequence:

Nucleic Acid Accession 
Coding sequence: 656..2758 (underlined sequences correspond to start and stop codons)



TGACGTCAAG 
AAGAGGGAAC 
CAGCATCATC 
GTTTCTGTTC 
TGTGGCTGTC 
AGCCGCAGCA 
GCACAACTTT 
TATTTTTTGC 
TGGAGTGCCC 
CAGTGGCGTT 
GAAGATGCTC 
TGCTAAGTTT 
GAACCCCCCG 




TGACAGCCCG 
ATGTGGGAAG 
GTTCCACTCA 
AGACTATTGC 
AACTGCGGAT 



TGACAAAGTG 
TGTGAGTGGG 
CTTCATTCTG 
GGAGCCTTAT 
AGGACTGCTA 
CTATACCACC 
GGAATACACA 
TCTTGAAGTT 
CGGCTTTTTG 
GGATGGGTTA 
CAACGTGCCT 
CGAAGTGTTT 
TGATATAAAC 



GAAAGCCCCC 
ATCTCGCTCC 
AAAGTTGCAT 
TACAGCCCCG 
CCCCCCCATC 
TCCTTTAAGC 
GGGGAAAGAA 
AAGCGCCTGA 
GAGATGCTGT 
GGGCTAGGGC 
TTACTGGAGG 
CCTGAGAGAG 
AAAGAATTCT 
GAGTTTTGCT 
AAACAAGTCA 
GAAGAGATCA 
CTGCGGCAGC 
GAAAAAGAAG 
TTGGACATTC 
AGCCTCGCAT 
AACCAAGAAC 
GTATCCAGAA 
GCAGAACTCC 
TACATCATTC 
AGTGATTTCA 
TATTCCATAC 
GCTCATGGGC 
ATCAATTTAA 



CCCCTTCCCT 
TGGCAGCACC 
AGGCTGTGTG 
CGCTGGCCCT 
CCCACCTAAA 
CTTCCTCCTC 
AACCCAACTG 
TCGGGCTCCC 
CGCTGTACAT 
CAAACTCCTC 
CTCCCGCGCC 
TGCTGCTGCT 



AGAAGCTGCA 
GCCTCCGCCT 
CAACTCCCGT 
TTCCAACTCC 
ACACTGGCAC 
CTAAGGCATT 
ATTTTTGTCC 
CTGGAGCTGC 
CAGCCCCTGC 



I 

CGGGGGCGCG 
CTAGTTAAAA 
GCCGCCGGCA 
GCCCCGCCGC 
TACACGGACA 
TTCTCCTCCT 
AACTGCAAAC 



CCGCCACCTC 



AAAGGAGAGA 
GCGGTGGCTT 
GCCTGGAGAA 
AAATCAAATG 
AAGTCTTGGA 
TTTACACTTG 
TTTACTATGC 
GAGGACCAGC 



CAGGAGGATG 



TAAGATATTT 
TGCACTTTGC 
AAGAGACCTA 



CCGTTGGTGC 
GTTATGTGAA 
ACAAACTTGT 
TCCATCCCAA 
GGTGGGCTAT 
AAAAT CCACA 
ACAGAAAGCA 
TTGGTGATGG 
CAGGCTCAGT 
CAAGGAGCAA 



CGATACTGTG 



CAAACACAAC 
CCTGCATAGT 
GATACTTACC 
TCAAAGTGGA 
TTACAAGAAA 
CGGGCCTCAT 
CCAAGTTGAT 
TCTGGGAGGA 
GATGATTACA 
GCTACGGCTG 
CCCACACTTC 
AGGCAGATGT 
TTCAGACTCC 



ATGTCCCAGC 
CTGTCCTGCT 
TCTGTTACCA 
TCTCCACATT 
GTACTTCCTC 
ATTCCAGGTT 
GGTGGGTTGT 
TTGGACCAGA 
TGCTTCTGTA 



CGAGAAGCGG 
AGAAGAAGAA 
GAGGAGACCT 
CGCCGTCGCC 
AGTGAACATC 
CCCACTTCCC 
GGTGTCATCC 
CCGCGTCTTT 
CCTCTGTCTC 
CCCTGCTGGG 
AGACG ATG CT 
TTGAAGGAGA 
GCCTGAATGG 
TGGAGCTGCT 
GCCTGCGGAG 
ACAACACAGA 
CTCAAAGCCT 



TTGAGAACAG 
CAACTGCTCT 
CTGGATGATA 
GATGTGGACA 
AACAGCACCA 



AATGGAAAAA 



TCCTTCAAAC 
GCTTTCCAGA 
TGGAAGAATA 
TTCAGGAGGT 
CGCAACGTCT 
AAATTTTCAA 
GAGATGAAAG 
TGTATGTGTC 
TTAGGGTTGT 
CCAGAGTCTT 
TTGGCCCTGA 
TGGAAGAAAT 
CAGACATGTG 
ACCAGCCCCC 
GACATCCCAC 
ACAGATCATC 



1020 
1080 
1140 



1500 
1560 
1620 



1860 
1920 
1980 
2040 
AGCCAGAATT CTACAGATAA 
ATTCAAGCCA 
AGAAAGATTG 
GCAAAGTCCT GTGACAAAGC 
CTGTAGAGGC TACTTTTCCG 
TTACATTTTA TCAAGCAGTA 



GACACTGACT 
GTGCTGCTGC 
ATGTCGTCAT 
TGGTCCTCAA 
TGATCAGATC 
TCTGGGACTG 
AAAAAAGACT 



TCAGAGTGCT 
AGTCCAGGCT 
GGAGGTGTCT 
TGTGAACAAG 
ATTGACATGA 
TTTGAATATT 



TAAAGGGGAA 
GTCCTTTGGT 
ACGTGTTTGG 
AGTGGCAAGA 
GTCACATCTT 
AAAGTATGAC 
TAATGCCTGA 
CCAGGCTCTG 
GGGAGGGGGA 
GTGTTAGACC 



AGATTATGAA 
TGGTGGATTT 
AGAT CGTAAT 
AAAACCACTC 
GGGATTTGGA 
CCAGACTCAC 



CACATACACA 
ATATACTTCC 
TTTTATGTTA 
AGTCAACTTT 
AAGGCAATAT 
GAATTTCTAA 
GTTACAGAAT 
AT CT C AAGAT 
TCTTCCTAAA 
TTACATATTT 



AGAAATGTGC 
TACTCATAAC 
TTATGCAAAG 



AATAGAGTTT 
TTTTATATTA 
GTGAGCAACT 
GCTACACACT 
TGTTTTCAAG 
AGGTCTGCTT 
ATATATTTTA 

3 Protein a 



CATCTTACTT 
CTATTCCAAT 
TACACACTCC 
AGATCCTCTG 
CCCTATATGC 
TAATTTACAC 
TTATTTGACT 
TGAAACAGTA 
AAGTACTATA 
TGATATAAAA 
TACCTTTTTA 
TGTTTTATAA 
TTATTGTATA 
TTTTATTTTT 



TCGAAACGGC 
CTTCTGCAGA 
GAACAAGTGC 
CATCCGCAGA 
GCTGGATCTA 
GGGCATTTAT 
TGTGATTTCA 



GGGAATTTCC 
TGTCTCGGCA 
GAAGATGAAC 
AATGGAAAAC 
GCCACGGTAC 
TACTGCACCC 



CACTTTTAGA 
GCTGCCAGTC 
TAACTCTCCA 



CTCTGTAAAA 



ACAAGTTACA 
TTTTTATCCT 
TTCTCTTTTA 
AGCATGTTTG 



CTGTGCAATC 
CTAGGAGAGA 
TTGTAATCTT 
TTGGCTGAGA 
TTAAATCATA 
TTTTATTTAA 



TTGTAAATTG 
TTTCTGTCTG 
CGATGGATCT 
ATGTTTCAGA 
CATTTTTGTC 
AATCTGGTTA 
AT AC- CAT ATT 
CAATAGGCAC 
CATCACCTAG 



TAGGTGAAGT 2340 
TCTACAAAAT 2400 
AACCTGCACA 2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 

TTTTTAAAAT 3000 
ATAATGGATT 3060 
TAATCACTAA 3120 
AATTAAAAAA 3180 
ACTCCCTGAT 3240 
AGTGTATCCA 3300 
TTTCATCTTA 33 60 
TTAAAATCAA 342 0 
TGGGTTTGTG 34 8 0 



GTGAGCCAGC 
AAGGATATCT 
CAGGTATTCT 
TTGT ATAG TT 
GTCATTAAAA 
TTAATTTAAA 



Protein Accession #: XP_039209 



MLKMLSFKLI, 



SLFHSPEREV 
PDFPRKQVRG 
RLFILEKEGY 
VSYTTNQERW 
PDSFLYIILG 



LEFKPFSNGP 
GSCRGYFSGH 
AQTLTSECSR 
YLGPQCEQVD 



LLAVALGFFE GDAKFGERNE GSGARRRRCL 
GFYPRLSCCL RSDSPGLGRL ENKIFSVTNN 
LERDLVLPLL CKDYCKEFFY 
PASNYLDQME EYDKVEEISR 
VKILTPEGEI FKEPYLDIHK 
WEYTVSRKN 
EMDGLSDFTG 
PTDIMINLTI 
QSERLYGSYV 



DGMITLDDME 



LVGGFVYRGC 
ILGFGEDELG 
LCRNGYCTPT 
RNIRRVTRAG 



MTQTHNGKLY 
GDFCRTAKCE 
YLLDLTSYIV 



41 

I 

NGNPPKRLKR 
TECGKLLEEI 
QTTADEFCFY 
EWSGLRQPV 
ERGLLSLAFH 
VFLEVAELHR 
MCNVPYSIPR 
SSARILQIIK 
LQQSPVTKQW 
KIVDPKRPLM 
PACRHGGVCV 



KCALCSPHSQ 
YARKDGGLCF 
GALHSGDGSQ 
PNYKKNGKLY 
KHLGGQLLFG 
SNPHFNSTNQ 
GKDYESEPSL 
QEKPLCLGTS 
PEECRATVQP 
RPNKCLCKKG 



Seq ID NO: 29 Nucleotide sequence: 
Nucleic Acid Accession #: NM_024756 

Coding sequence: 75..2924 (underlined sequences correspond to start and stop codons) 



AAGACAACGT 
GCCCCACCAC 
GGCTGCTGGG 
GGACACCTGG 
ACTGGTGCCC 
AATTCCTCAT 
AAGTCATGTA 
CTTTGGCCTG 
TGGCAATCCC 
TCAGCTTCAA 
AGCAGGAACA 
CAGGCCTGTG 
CAGGGCACGA 
TCCTACAAGT 
CCCAGGCCAT 



AGGTCCAGGA 
ACGCCCAGCA 
AATTGAAGAG 
CAACGCCTGG 
TGCAGAGGAA 
ACACCCTGGA 
ACTCCGAATC 
TGCAGGTGAA 



.CACTAGCAGT 
CAAGATGATC 
GGCATGGGCC 
GGTCTGGAAG 
CTACCCAATG 
CCACTCGCAG 
CCGCATGGCC 
GAGGTGCTGC 
TGAGCCTGCA 
ACCTGGCCAC 
TCTGCTGGGA 
GAAAGCCCTG 
GTTCCCTGAT 
GCATTTCAGC 
AAGAAACCTG 
TGCCGTGGCC 
GAACACTCAG 
CTTTACCCTG 
GCTGCACAAG 
GGCTGGGGCA 
CCTCTCAGAG 
GGACATGAGG 



21 
I 

TTCTGGAGCT 
CTGAGCTTGC 
CAGGCTTCCA 
GCAGAGGCTG 
TCCAAGCTGG 
CAGCCGTGTC 
CACAAGCCAG 
CCTGGCTACA 
GATCCTGGTG 
CTTGCTGCAG 
GATCTCCAGA 
CCTGGTAACC 
AGATCCTTGG 
CCCATCTGGA 
TCTCTTGACG 
AGGGCTGACT 
AGAGTGGGTC 
CACCGCTCGA 
GCTCAGGAGG 
AGGCCTGAGC 



31 
I 

ACTTGCCAAG 
TGTTCAGCCT 
GTACTAGCCT 
AGGACACCAG 
TCACCTTACT 
CGCAGGGAGC 
TGTACCAGGT 
CGGGCCCCAA 
ACAGCCACCA 
TGATCAATGA 
ATGATGTGCA 
TCACAGCTGC 
AGCAGGTGCT 
GGAGCTTTAA 



I I 
GCTGAGTGTG AGCTGAGCCT 
TGGGGGCCCC CTGGGCTGGG 
CTCTGATCTG CAGAGCTCCA 



AGCTCTTTGC 



CTGCGAGCAC 



GGTTGAGGTG 
CCGGGTGGCA 
AGTGATGGAA 
GCTACCCCAC 
CCAAAGCCTG 



AAAACAGAGA 
CAGAAAGTCA 
GTGCTGACCT 
CACGATTCCA 
GATGGACCAG 
CAACAGGAAC 



GCAAATCAAA 
GTGGACACCT 
CACAGCCTTA 
ATCTCCAGAG 



TCCAGGAGCT 
AGCTGCGACA 
TCTCAGAGCT 
CCCCAGGGAC 
CGGACAGCCT 
CCACGGCCCG 



TGGTGCCAAA 
GGACGTGGAG 
CCAAGCCGAT 



CCACACGGCG 



GCCACCCTGA 
TTCGATCAGA 
CTCCGTGAGC 



GCAGGCCAGG 
CAGGGAGGAG 
GGATGAGATC 



TGCGCGTGAT CCTGATGGAG 



GACCGCCTGC 
GTGGACACCA 
CTGGTGTTGG 
CTGGGCCAGC 
GAGTTGCAGT 
AAGGAACTGT 
GTGGAGGAGC 
AAGTCTCTGA 
TCATGGAGGA 
AGCACCTGCA 
AGCTCTATTT 
AGGAGACCCA 
TGCAGAACGC 
GGGCGCGGGC 
GCGCGCTGAA 
TCGCCGCCCT 
AGGAGGTGCT 
AGATCCGCGT 
ACGAGCTGGC 
AGCACCTGGA 
GGCTGGCGCG 
AGGCCGAGGC 
CACTCTTCGC 
GGAACTTCCA 
TGCTGAGCAG 
AGAAGGAAGC 
GCGCGGCGCT 
CGGCTGCCCT 
TCCCTGAACA 
TTGAATTTGG 
CAGTCTGTAC 
TGCAGAAGGG 



GAACAAGGAG 
GGGTGGCCAT 
AGACCTGGAC 
GGTGAGCCTG 
CGTGGACGCC 
GGCCACGTCG 
GGCGGCCGCG 
GCTGGAGGAC 
GGAGGAGATG 
GGCCCTGCAG 
CGCCCGAGTG 



GAGGTGGAGC 
GCCGACCTCA 
GTCATCCGGG 
GACGAGCGGC 
GTGTCGCTGG 



TCAAGTACGT 



GCCGAGGCCC 
GCGCTGCGGC 
TCTGAGCAGA 
GACGCCGCTA 
ACGGCCCTGG 



GGCAGCTGGA 
CCGTGGACGC 
GCCAAGTGCA 
GCCACGAGGT 
ACGAGGCGGT 



GCGGGCTGCA 



CACTCAGCGC 
AGGGCTCATG 
GAAAGGGAAG 
GGAGCCTTTG 
CTGGGAGGCA 
GCAGACAGTG 
TGGCTACTTC 



GATCAGACAT 



AATGCTTCTC 
TCAGACAGCA 
GTGTTTTGCT 
TCCTCTTCTC 
AAAATGGAGG 
AACCTGCTTG 
GAAATTAGTA 
CTAAAATATC 
GAGGTAATTG 
TCTGATGATT 
TGAAGATGCC 
CAGCCATGTG 



CACTGGGCAG 
TGAGCGAGTA 
TGCATTTGGG 
CATGGACTCG 
TCAGTTGGTC 
CCCTGTAACC 
CGGGCAGGAC 
CGGCCTGGGC 
CCTCCTGTGG 
TTCTCCCTTA 
CAACATTTTG 
GTCTCTATTA 
TCTTTAGTTT 
ATCTTGAATT 
GATCTTGGGG 
TTATAAGTTT 
TTGGTTCCTC 
GAACAGTGAG 



AGCCTGAGCA 
GCCGCCTCCC 
AGCTTGGAGC 
GAAGCCAACG 
AAGCAGCAGA 
GTGGACATAC 
GGATCCCCTG 
AAGTTCAACA 
CGAGCCCCTG 
GGCACCGGGC 
GGGAGTGGAA 
TGGTTTGAGT 
GGCTTCCTGA 
CCCAGCTCTC 
TGTCTCTTCC 
ACATGGGGCT 
ATGGTCCTAG 
TCCAACTCTT 
TTGGAAACTT 
TCGTATGATA 
GCCAACATTG 
GTCAGTAATG 
CAGTCATTCC 
GTAATCCCTA 
GCGGTTCCCC 
GATAGTTCCT 
TTCACTGTCT 
TCAATTAAAC 



AGCTGGTGTT 
GCACAGCAAC 
TAACCCAGGG 
TGTTTAAGAC 



GGAGCTCAAC 
GAAGGACTGC 
GGACGCCACG 
CGGCTCCTCC 
GCACAAAGCG 
GGCGCTGGAT 
GCGCCAGCTG 
GCTGGCCGCG 
GCTGCCCCTG 
GGAGCAGGCG 
GGAGCCCCCG 
CGCCACCACC 
GAATGTCGGG 
CCTTGACGGC 
GCTCTTCCAC 
CCTGGGGAAG 
AGCTCCCCGG 
GCCTGTGCCA 
TGCCAGCTTT 
CAACATTGGC 
CTACCTGTTT 
TGGAGGTCAC 
GGTCTTTGCC 



CTCACGCTGC 
AATTGCCAGA 
CGTGCCCTGG 
CTGCAGGCCC 
GAGGGCGAGC 
GACGAGGTGG 
CACAGCGCCT 
CTCTTCGGGG 
AGCTACGAGC 
CTCGGCTGGG 



GCCCTGGCCG 



CTCCACAACG 
AGCCTCTTTG 
CTGCAGACCA 
AAGAGGGACA 
GGTGCCTTGG 
TCAGAAGGGA 
AGCAGCTACT 
GCAGTGAGCG 



CTGGAAACCT 



GAAGCCTGAA 
CACCACACCC 
CTGTACAACA 
GAAAGACATT 



CTGAACCCCA 
GGCTCTGGCC 
TCT2CAAAGA 
TCCATGATGA 
CCTTGGCTTG 
TGTATTCTAC 
CTTTAAACTT 



TAATCCCCAC 
CATGCTGTTC 
CCTGTGTTCA 



TTGTGATAGT 
TTCTCCTTCC 
TAAGTTTCCT 
TATAAATT 



ATGGCTGAGC 
AAGAGAAGCC 
GCCCCAATCT 
AAGGATGGGC 
TGGTGTGGTG 
GAAGGACTGG 
GCATGCCTTC 
AACTTCTTTG 
TTCTCTTGCT 
AGGAATGTTT 
GGATTAAACC 
CAAGGGAAAG 
GTGCCCCCAC 
AGATCAGGTG 
TCTCACGAGA 
TGCCACCTTG 
GAGGCCTCCC 



2040 
2100 
2150 
2220 
2280 
2340 
2400 
24S0 
2520 



30S0 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3730 





I 1 


21 


31 


I 1 


51 




1 

MILSLLFSLG 


GPLGWGLLGA 


1 

WAQASSTSLS 


1 

DLQSSRTPGV 


WKAEAEDTSK 


1 

DPVGRNWCPY 


60 


PMSKLVTLLA 


LCKTEKFLIH 


SQQPCPQGAP 


DCQKVKVMYR 


MAHKPVYQVK 


QKVLTSLAWR 


120 


CCPGYTGPNC 




PADPGDSHQE 


PQDGPVSFKP 


GHLAAVINEV 


EVQQEQQEHL 


180 


LGDLQNDVHR 




ALPGNLTAAV 


MEANQTGHEF 


PDRSLEQVLL 


PHVDTFLQVH 


240 


FSPIWRSFNQ 


SLHSLTQAIR 


NLSLDVEANR 


QAISRVQDSA 


VARADFQELG 


AKFEAKVQEN 


300 


TQRVGQLRQD 


VEDRLHAQHF 


TLHRSISELQ 


ADVDTKLKRL 


HKAQEAPGTN 


GSLVLATPGA 


360 


GARPEPDSLQ 


ARLGQLQRNL 


SELHMTTARR 


EEELQYTLED 


MRATLTRHVD 


EIKELYSESD 


420 


ETFDQISKVE 


RQVEELQWH 


TALRELRVIL 


MEKSLIMEEN 


KEEVERQLLE 


LNLTLQHLQG 


430 


GHADLIKYVK 


DCNCQKLYLD 


LDVIREGQRD 


ATRALEETQV 


SLDERRQLDG 


SSLQALQNAV 


540 


DAVSLAVDAH 


KAEGERARAA 


TSRLRSQVQA 


LDDEVGALKA 


AAAEARHEVR 


QLHSAFAALL 


600 


EDALRHEAVL 


AALFGEEVLE 


EMSEQTPGPL 


PLSYEQIRVA 


LQDAASGLQE 


QALGWDELAA 




RVTALEQASE 


PPRPAEHLEP 




TTALAGIiARE 


LQSLSNDVKN 


VGRCCEAEAG 


720 


AGAASLNASL 


DGLHNALFAT 


QRSLEQHQRL 


FHSLFGNFQG 


LMEANVSLDL 


GKLQTMLSRK 




GKKQQKDLEA 


PRKRDKKEAE 


PLVDIRVTGP 


VPGALGAALW 


EAGSPVAFYA 


SFSEGTAALQ 


840 


TVKFNTTYIN 


IGSSYFPEHG 


YFRAPERGVY 


LFAVSVEFGP 


GPGTGQLVFG 


GHHRTPVCTT 


900 


GQGSGSTATV 


FAMAELQKGE 


RVWFELTQGS 


ITKRSLSGTA 


FGGFLMFKT 






Seq ID NO: 31 Nucleotide sequence: 











(underlined sequences correspond to start and stop codons) 



11 



21 



31 



41 



51 



I I I I I I 

GAACGCTCAC AGAACAGGCA gtgcaattcc atgttcctct taagtatgtt ac-ccctaccg 

GGAGCTGAGC TGGCCAGTCT ACTTGGAGAG GAAAAGTAC-A TCTGGGGAAG GTGGAAGGGT 
CAGTTCCTAA GTGACTTCCT CCTCGGGGAT GGTAAGGGCA TTTGCTGATC TCCAGTGACT 
GCCTGGTGCC TCATGGTCAG ACTCGGCTGT CTCACTCCCA GATATCTGAT TTTGCAAAAA 
GGGACACACC TATCTGCAGC AAAGAAGACA CTGACCAGAT TGCGAGCGGT GCTTTTGGAT 
GCTCTGTAGC CACCCGGGGC CCAGGAGGAC 
TCGGAGACCA TGGCAGTGCA GCTGGTGCCC 
GAGGGCCGCC GATGTCAAGT ACATCTTCTT 
CCCAAGCTGT TGGCCAAGGA GCTTCTTGAC 
5 AAGGAGTACT TTGGAATAGC ATTCACAGAT 
GATCGAAGAG TATTGGAACA TGACTTCCCT 
TGTGTCAGGT TCTATATAGA AAGCATTTCA 
TTCTTTCTGA ACGCGAAGTC CTGCATCTAC 
GTGTTTGAAT TAGCTTCCTA TATTTTACAG 

10 GTTGTGAGGA GTGACTTGAA GAAGCTGCCA 
CCTTCCCTGG CCTACTGTGA AGACAGAGTC 
ACAAGAGGTC AAGCAATCGT AAACTACATG 
GTTCACTATT ATGCAGTGAA GGACAAGCAG 
AAAGGGATCT TCCAGTATGA CTACCATGAT 

15 AGACAGTTGG AAAACCTGTA CTTCAGAGAA 
CGCAGGGCTT CAGTGACAAG GAGGACGTTT 
TATGCATGTC CGGCATTGAT CAAGTCCATC 
TATCTGGACA GAAAGCAGAG TAAGTCCAAA 
GCCATCGACC TGACCGAGAC GGGGACGCTG 

20 AAGGGGAAGA TCATCAGCGG CAGCAGCGGC 
GATAGCTCGC AGTCGGCCAA GAAGGACATG 
CTGGAGGAAA CCCTGCGTCA GAGGCTGGAG 
GAGCTCACGG GCAAGCTGCC AGTAGAATAT 
GTTCGGAGAA GAATAGGAAC AGCCTTCAAA 

25 GAGGAAGCTG AGCTGGAACG CCTGGAACGA 
GCCGCCCGCC GCCTAGCCAG TGACCCCAAC 
ACCTCGTATC TGAATGCACT GAAGAAACTG 
CGCATCAAGT CTGGGAAGAA ACCCACCCAG 
ATTGCCAGTG AAGACAGCTC CCTCTCAGAT 

30 GTTACCAGCA CAATATCCCC CCTACATTCT 

TCGCACAACA GGCCTCCTCC TCCCCAGTCC 
CGCAACGACT ATGACAAGTC ACCCATCAAG 
GAACCCTATG AGAAGGTCAA GAAGCGCTCC 
TTCCCCAGCA CAGGAAGCTG TGCGGAAGCC 

35 CCCATCCGCG GCCTCCCGCA CTGGAACTCC 
CGGGTCCGGA GTCCCCACTA CGTCCATTCC 
CTGCACAGCC TCGCACTGCA CTTTAGGCAC 
CTCCTGGGCT CGGAAAACGA CACCGGGAGC 
AGCAACGGCT CAGACCCCAT GGACGACTGC 

40 CACTACTACC CGGCGCAGAT GAACGCCAAC 
AAGGCGCGCC AGAGGCAGAG GCAGCGGCAG 
TCGGGCAGCA TGCCCAACCT GGCGGCGCGC 
GGCGGTGTGT ACCTGCACAG CCAGAGCCAG 
CCGCTGTACA TCGAGGGCGG CGCCACGCCC 

45 GAGTGCCACT ACAGCGTCAA GGCTCAGTTC 
CTGTTCAAGG AGAGCTGGCG CGGCGGCGGC 
CCGTCGCGAT CGCAGATCCT GCGGACTCCG 
GGCGCGGGCC GTGCCGCCGT CTCAGACGAG 
TCGCACAAGG AGCACAGCCG CCTGTCGCAC 

50 CAGTACAGCA CCTCCTCCCA GAGCACCTTC 
CAGATGTGCA AGGCCACGTC AGCTGCCTTA 
AGTGAAATTG GAGCCACCCC CCCAAGCAGC 
GAAGCAACAG AAAACTCACC CATTCTGGAT 
GATGAATAGA GGAGCTACAA TGATAGCTGT 

55 GCTGATGTCC AGTGGTACGG GCAGGAAAAA 
CCGGCCTAAT CTGACCGCCT CAACGCCATT 
TTACCCAGAC GCACCGTCAC CCTGCACCAG 
CTCCGCATTC CCTCCCCCTT GAAAACCTGA 
CACTGTGTGT CCCCTGGCGC TCTTGCCCAT 

60 ■ CTTGGTGGCT TCCCTCTGCC ATGACAGCCC 
GGCATCCAAT TCCTGCGGAT AAGTAGCGTT 
CAGGGTGACC CAGAAAGACG ATTCAGCTGT 
CAAGCACTTC ATGAAGAGGA GGCCTCGTGG 
GATGGGACAG CTTGTGGGGA TGGCTATGGG 

65 ACACCAGAAA TGCATCGGAG GACCACAATC 
TAAAAACATA AAAAATTAAG AGGGGCCAAG 
TTTTAAATTC TGAACTGCTA CTACACACAA 
CTCTCTCTAG CCCTCTCCCT TACTGGCCCA 
GCCCCAATGC CACGGTAAAG GCGAGGAAGT 

70 ATCCATCTGG ACACAAAGAG AGACCTGTGG 
CATGCAGGGG GTTCAGCCGA GCCCAAGACT 
AACGTAAGGT GATAATGGCC AAAAGTGGTT 
ATCCTATTTT TTTGCATAAG GTGTTTCATT 
ACATTGCGAT CCATTCAGTG TTTAACTGTC 

75 GTGACAAAAG AGCTCAGATC CGACTTCTCC 
TGCCCTTAGG TAGAAAGATT TGACTCGTGT 



TGACTCGGCA GCAGGATTCG TGCATGGGAA 360 

GACTCAGCTC TCGGCCTGCT GATGATGACG 42 0 

GATGACAGGA AGCTGGAACT CCTAGTACAG 48 0 

CTTGTGGCTT CTCACTTCAA TCTGAAGGAA 54 0 

GAAACGGGAC ACTTAAACTG GCTTCAGCTA SOO 

AAAAAGT C AG GACCCGTGGT TTTATACTTT SSO 

TACCTGAAGG ATAATGCTAC CATTGAGCTT 72 0 

AAGGAGCTTA TTGACGTTGA CAGCGAAGTG 780 

GAGGCAAAGG GAGATTTTTC TAGCAATGAA 84 0 

GCCCTTCCCA CCCAAGCCCT GAAGGAGCAC 900 

ATTGAGCACT ACAAGAAACT GAACGGTCAG 960 

AGCATCGTGG AGTCTCTCCC AACCTACGGG 1020 

GGCATACCAT GGTGGCTGGG CCTGAGCTAC 1080 

AAAGTGAAGC CAAGAAAGAT ATTCCAATGG 114 0 

AAGAAGTTTT CCGTGGAAGT TCATGACCCA 120 0 

GGGCACAGCG GCATTGCAGT GCACACGTGG 1260 

TGGGCTATGG CCATAAGCCA ACACCAGTTC 132 0 

ATCCATGCAG CACGCAGCCT GAGTGAGATC 138 0 

AAGACCTCGA AGCTGGCCAA CATGGGTAGC 144 0 

AGCCTGCTGT CTTCAGGTTC TCAGGAATCA 1500 

CTGGCTGCCT TGAAGTCCAG GCAGGAAGCT 1560 

GAACTGAAGA AGCTGTGTCT CCGAGAAGCT 1620 

CCCCTGGATC CAGGGGAGGA ACCACCCATT 168 0 

CTGGATGAAC AGAAAATCCT GCCCAAAGGA 174 0 

GAGTTTGCCA TTCAGTCCCA GATTACGGAG 180 0 

GTCAGCAAAA AACTGAAGAA ACAAAGGAAA 186 0 

CAGGAGATTG AAAATGCAAT CAATGAGAAC 192 0 

AGGGCTTCGC TGATCATAGA CGATGGAAAC 198 0 

GCCCTTGTTC TTGAGGATGA AGACTCTCAG 204 0 

CCTCACAAGG GACTCCCTCC TCGGCCACCG 2100 

CTGGAGGGAC TCCGACAGAT GCACTATCAC 2160 

CCCAAAATGT GGAGTGAGTC CTCTTTAGAT 222 0 

TCTCACAGCC ATTCCAGCAG CCACAAGCGC 2280 

GGCGGAGGAA GCAACTCCTT GCAGAACAGC 234 0 

CAGTCCAGCA TGCCGTCCAC GCCAGACCTG 2400 

ACGAGGTCGG TGGACATCAG CCCCACCCGA 2460 

CGGAGCTCCA GCCTGGAGTC CCAGGGCAAG 252 0 

CCCGACTTCT ACACCCCGCG GACTCGTAGC 2580 

TCGTCGTGCA CCAGCCACTC GAGCTCGGAG 264 0 

TACTCCACGC TGGCCGAGGA CTCGCCGTCC 2700 

CGGGCGGCGG GCGCACTGGG CTCAGCCAGC 2760 

GGGGGTGCGG GGGGCGCGGG GGGCGCGGGG 2820 

CCCAGCTCGC AGTACCGCAT CAAGGAGTAC 2880 

GTGGTGGTGC GCAGCCTGGA GAGCGACCAG 294 0 

AAGACGTCCA ACTCCTACAC GGCGGGCGGC 3000 

GGCGACGAGG GCGACACGGG CCGCCTGACG 3060 

TCGCTGGGCC GCGAGGGCGC CCACGACAAG 312 0 

CTGCGCCAGT GGTACCAGCG TTCCACCGCC 318 0 

ACCAGCTCCA CCTCCTCGGA CAGCGGCTCG 324 0 

GTGGCGCACA GCAGGGTCAC CAGGATGCCC 3300 

CCTCAAAGCC AGAGAAGCTC GACACCGTCA 3360 

CCCCACCACA TCCTAACCTG GCAGACTGGA 342 0 

GGGTCTGAGT CTCCACCTCA CCAAAGTACT 3480 

TTCCTGGATT CCTCCCTCTA TCCAGAACTA 3540 

GCCAAGCCCG GGACCCTCGT GTGAGCCAGC 3600 

CTGAGATCAC CTCACTGCCT CTCATTTGCC 3660 

CTTTGGCCCT CAGCACTTTT TTTCTCCTGT 3720 

CTGAGGAGAC ATTCTGGAAG GTTCCGGTCC 378 0 

AGAGAGCCAG ACACCAATCC TCAATGGCAC 384 0 

CTAGGCCAGG AACCATCAGG GGGGCCAGCC 390 0 

GGGAGAGAAC GGGAAAGGGG ACTTGGGTTA 3960 

GTCCAGCCTG CCACCCATAC GTAGGCCAAC 402 0 

CATATTCAGT TTACACCTGA AATATTCCTT 4080 

GGAAGGGGAG GTTGAGAAAG GAAGTTCTCG 414 0 

AGTTCTATGC TGCCAAAGAT TAAAAATAAA 4200 

AGGAAGACAT TCTTTCTGCA AGGAAATTTC 4260 

GTGAAAGTCA ACCCTATGTA AACTGGTGTC 4320 

CTTCTCTCTC CGTAGAGAGC CTGAAAAACT 438 0 

CTTGGCTGGC GTTGCTGACT CACAGTCGCC 444 0 

GAGTCATAGA GGGTACTGTT AGCCCCGGTC 4500 

CAAAGCTGCT TTCCTTTCAG GATTTGTAGT 4560 

CTCTCTCATT AAACCAACCA GTAAAAGCGT 4620 

TTCGTTTTTA TGGGAAACCA AGGGAAAAGC 468 0 

GTGGCTCATT TTCTGTTCGT TAGCACTTGT 474 0 

TATGTGTCAC TTATTCCAAG AACCCAACTA 48 00 

GTCTACTAGC CAACAGGCAG AGCAGGGTTG 4860 
AAAAAAATAT 
ACATTTGTGT 
GCACGTTTAC 
GGCCTCAAAA 
CACAAAAAGG 
GGGGTGGGGG 
CCAGGGACAG 
AGCATAAAAA 



CAGCTCCCAA 
GCAGATACCA 
ATGTTTTGAG 



CCTTCTGCTG 



TTTCCGCAGA 
TGGAATTTTT 
GAGCCAGGGT 
CAAAGAAAAA 
GACCAGGAGG 
CAGCCTGGAG 



AGGGCCCATG 
AAAGAGGAGG 
CTATGCTTCA 
AGTAACAAGT 
GGTGGTATGC 



GGGGGTAGTT 
TCTTCGCTTT 
GTAAGGATGG 
ACCACCGAGA 



TGTCTACATC 
AAAGAAGAAA 
AACACAACTG 
GCACGACTTT 
TGTGCTTTTG 
AATGACTTCC 
TTGTGGGAAA 
TCATGTATGT 



GGGCCTGCTT 
TTCATATTCC 
AT TAAATGAT 
AGCTACAAAG 
TGGGCCATTT 
CAGGAAAAGT 
ATGCAGAGTA 
GTTGCAAGGA 
CAGAGAGAAA 
AAGACAGAGA 



TCCCCAGCTC 
CAGAACGCTT 
TCTAGGGATT 
AACAGTGATT 
TTCTTTCTCC 



GTCGAGCTGG 
CTCCTCCACC 
TGGACTGGTG 



TGTAGTTATT 
AGGTGAAATA 
AAAGAAATTG 
GAGCCAGGTT 
AATTTGTTTT 
TCAGATGACC 
AATAGTTTAT 
AGGTTGCTCC 
TGTAATCAAT 
GCTTTTAATT 
AGTGTTGTCA 



CGAAGTCCTT 



CAAAGAAGAT 
AAGGCAGCTG 
TAGTCTGGAG 
AAAATTCTAA 
GAATTACAGT 
CAGGTTTTAC 
TGAGACTGAC 
ACTTGCAAAC 
GGTCTGATAT 
CCAGAAGACT 
TTTATTATAT 
TAGTTTAGTT 
TAGTTAACTA 
GTAAAATGGA 



TCATGGATAG 
ATGAGGTTAC 
AAAACTGGAT 
AGATTTGGGT 
TCCAACAAAC 
TGTTTAGCTG 
ATGATTAATG 
TCTTTATCCC 
AAAGCCCCTA 



TGAAACTGAC 
GAAAAAGATG 
TATTCTTTTT 
ACACTTATTA 



TACGTTGTAG 
GTATTAAGAA 
AAGCATTTTC 



GCAGGTGAGT 
CTAAATATAG 
GGTGTAAAGT 
CTCTAGTTTA 
GTAAAATGAT 
ACAAAGTTGC 
TTTGTATAGT 
AAAATG 



ATCAGTTACT 
AAAATTAATG 
GAAAGCCATC 
AGTTGGGTTA 
GCGCAAGTGG 
TATTGGAAAG 
GCAGAACTGA 
GGAATCCAAG 
GAAACAAATA 
GGCACACACA 
AAGTCTCTCT 
ATTCCAGGGT 

AAAGAGATAC 
GCTTTTACTT 
CATTCAGAAA 
TCAGACTGAA 
ATGGTTACAT 
CAAGATTCTA 
TATATTTTCA 
ATCATGATAG 
AGTTCAGTTA 
ATCAGTGTGG 
TATTTCTTTA 
TTGGATTCTT 
CATACATCAG 
GTTGAAACTG 
CAAACCCCAT 
TTTACATATA 
ATAATGGGGG 
TTTACTGGAT 
AG AAT TAAAA 
ATCTTTACAT 



GTCATGCACC 
TGTGGGAGCT 
AATCTTCAAA 
TTCAAGATGG 
TGGGGGGATG 
GCATTGACAG 
AGTTAGCTTA 
AATAACCATA 
CCAAGGTATT 
CACCTGGCCG 
AGACAATTCA 
GCAGAAGGGA 
CGGTTACATT 
TCATGGTTAG 
CATTATACAT 
CTGTGTGCAA 
GTTCTACATC 
GCCCACTGGA 
ACTTGGGGGA 
TCTGGTAGTC 



AAACAAAATA 
CGGATGCGTA 
ACTATGTTGT 
TTAAAATTCC 
CCAGGTGCTA 
CATTAATTTC 
AAAAAAGATT 
TCTGTTCAAC 
AAAGAAATCT 
TTTGCAAAAC 



5700 
5760 
5820 
5880 



6420 
6480 
6540 
6600 
6660 



I 

MAVQLVPDSA 
FGIAFTDETG 
NAKSCIYKEL 
AYCEDRVIEH 
FQYDYHDKVK 
PALIKSIWAM 



11 
I 

LGLLMMTEGR 
HLNWLQLDRR 
IDVDSEWFE 
YKKLNGQTRG 
PRKI FQWRQL 
AISQHQFYLD 



LASYILQEAK 
QAIVNYMSIV 
ENLYFREKKF 



GKLPVEYPLD 



EDSSLSDALV 
YDKSPIKPKM 
GLPHWNSQSS 



I EGG AT P VW 
SQILRTPSLG 
TSSQSTFVAH 
ENSPILDGSE 



21 31 41 51 

I I ! I 

RCQVHLLDDR KLELLVQPKL LAKELLDLVA S'" 

- GPWLYFCVR FYIESISYLK DNATIELFFL 120 

GDFSSNEWR SDLKKLPALP TQALKEHPSL 180 

ESLPTYGVHY YAVKDKQGIP WWLGLSYKGI 240 

SVEVHDPRRA SVTRRTFGHS GIAVHTWYAC 3 00 

ARSLSEIAID LTETGTLKTS KLANMGSKGK 360 

LKSRQEALEE TLRQRLHELK KLCLREAELT 420 

PGEEPPIVRR RIGTAFKLDE QKILPKGEEA ELERLEREFA IQSQITEAAR 4 80 

KLKKQRKTSY LNALKKLQEI ENAINENRIK SGKKPTQRAS LIIDDGNIAS 540 

LEDEDSQVTS TISPLHSPHK GLPPRPPSHN RPPPPQSLEG LRQMHYHRND 600 

EKVKKRSSHS HSSSHKRFPS TGSCAEAGGG SNSLQNSPIR 660 

SPHYVHSTRS VDISPTRLHS IALHFRHRSS SLESQGKbLG 720 

SDPMDDCSSC TSHSSSEHYY PAQMNANYST LAEDSPSKAR 780 

MPNLAARGGA GGAGGAGGGV YLHSQSQPSS QYRIKEYPLY 840 

YSVKAQFKTS NSYTAGGLFK ESWRGGGGDE GDTGRLTPSR 900 

REGAHDKGAG RAAVSDELRQ WYQRSTA3HK EHSRLSHTSS TSSDSGSQYS 960 



Seq ID NO: 33 Nucleotide sequence: 
Nucleic Acid Accession #: NM_014331 

Coding sequence: 1..1506 (underlined sequences correspond to start and stop codons) 



ATGGTCAGAA 



AAGAGGAAAG 
GGAATCTTCA 
ACCATCTGGA 
GGAACAACTA 
TTACCAGCTT 
GTGATATCCC 



AGCCTGTTGT 
TGCCTTCCCT 
TCACTTTACT 
TCTCTCCTAA 
CGGTGTGTGG 
TAAAGAAATC 
TTGTACGAGT 



GGGCGTGCTC 
GGTCCTGTCA 
TGGAGGTCAT 
CTGGGTGGAA 
ACGCTACATT 



31 

I 

TCCAAAGGAG 
GAGCCACCTG 
TCCATTATCA 
CAGAACACGG 
CTATTTGGAG 
TACACATATA 
CTCCTCATAA 
CTGGAACCAT 



GTTACCTGCA 



TTGGCACCAT 
GCAGCGTGGG 
CTTTGTCTTA 
TTTTGGAAGT 
TACGCCCTGC 
TTTTTATTCA 



51 

I 

GGGAAATGTT 
AGTGCAGCTG 
CATTGGAGCA 
CATGTCTCTG 
TGCTGAATTG 
CTTTGGTCCA 
AGCTACTGCT 
ATGTGAAATC 






WO 02/079492 



PCT/US02/04915 



CCTGAACTTG 
AGCATGAGTG 
GCAATTCTGA 
TTTAAAGACG 
TATGGAATGT 
AACCCTGAAA 
TATGTGCTGA 



ATCTTTGTTG 
TTATTCTATG 
CGCAAGCACA 
TTCTCTGGAG 



CGATCAAGCT 
TCAGCTGGAG 
TAATTATAGT 
CGTTTTCAGG 
ATGCATATGC 
AAACCATTCC 
CAAATGTGGC 
CAGTGACCTT 
CCCTCTCCTG 
TTGCGTCTCG 
CTCCTCTACC 
ACCTCGACAG 



TTCAAGGTGC 
CTTTCCCTCT 
GTCCCTGCGT 
TCAGAGAAAA 
TTATGAACTA 
TTTTTACTTC 
CAGTTATTTT 



CACTGTTCAT 
ATTCGGACCC 
ATTATCTCTT 
TAACCAGAAC 
ATGGACTTGA 
ATTTTCTGAA 
TATTCATATA 
AGTTATAGAA 



AAGAGATTCA 
TGGCTGGTTT 
CCTTGCAATA 
CTACTTTACG 
TTCTGAGCGG 
CTTTGGCTCC 
AGAGGGTCAC 
AGCTGTTATT 
TCTTTTGAAT 
GATTTATCTT 
CCCAGCTTTG 
ATTTAGTACA 
TATTATATGG 
ATTACAAATA 
GATCTTGGCA 
AGTCTAGAGA 



AGTATTACGC 
TAGCTCAACT 
TGTATATCCA 
ACCATTAATG 
CTACTGGGAA 



CTTCCAGAAA 
GTTTTGCACC 
TTCCTCAGTT 
CGATACAAAT 
TTTTCCTTCA 
GGGATTGGCT 
GACAAGAAAC 
ATACTGGAAG 
ATCTGCCCAA 
ATTACAACTT 



CTGTAGTGAT 
TAACCTTTTG 
TTAAAGGTCA 
GGTTGCCACT 
TTGTTACTGA 
TGGCCATTGT 
CTGAGGAGCT 
ATTTCTCATT 
GTGTGTTTGC 
TCCTCTCCAT 
CTTTGACAAT 
TTGCCAGGTG 
GCCCAGATAT 
CATGCCTCTT 
TCGTCATCAC 
CCAGGTGGTT 



AACGCAGAAC 
GGCTTTTTAT 
AGAAGTAGAA 



AGCAGTTCCG 
TGTCTCCAGG 
GATTCATGTC 
GATAATGCTC 
GCTTTTTATT 
GCATCGTCCT 
CATGGTTGCC 
TCTGACTGGA 
TAGAATAATG 
AGAAGATAAG 



GTCTCTGATA 
TCTCTACAAC 
TATATATGGG 
TTTTCAATTC 
ATTTTACATT 
GCTTTAATGG 
TTAAAGAAGA 



ATATGTTAGC 
TTTTGTAAAG 
TGAAAAAAAG 



TTATCTGTCA 
AGCAAGAGTT 
TACCCCTGAT 
TGAGAGAAAT 
CTACATGCAA 
TGAATTTTGA 
CTTCAGATGA 
AAGAAATGTC 
ATCTAGGCTT 
CTGATAAGAA 
GTTTTGCCAG 
GCACTTTGGG 
CAACATGGAG 
GCTGGTAATC 
GAGGTTGCAG 
CCATCTCCAA 



ATTATACCCA 
GTTTCTAGGG 
GAGAATTTAT 

AGTTTGGTAT 
GAGTCTATCT 
AACCAACAAA 
TGTTAGTAAT 
CAGTTTGTGC 
AACTGTCCAG 
GCTGTAAATA 
TGTCAGTAAT 
GAAAATTGAA 
TATTAGAAAA 



AGTGAATATG 
GGGGTTAGGA 
ACGGCAAAGA 
ATGGTTTTAC 
CATACATCAT 
TGCTTCCCCT 
GAGCACTTTG 



CAGTTATTCT 
GAAAAGACTA 
ACCTTCAAAT 



TGGTGATAAA 
TTTCTAAGAA 
ATGAGTCGCA 
GACAATTACT 



CAAAAGGAGT 



TATGTCAGAT 
TCACATCAGT 
TAAATCCTCA 
AAACATATGC 
GAAGATGTTC 
TCTGAAGTTT 



ATTAATTAGG 
AGATTTACAA 
TTCCACACCT 
ATGAGAATCT 
TACTGTGAGC 
GGTGGATCAC 
TCTACTAAAA 
AGGAGGCTGA 
TTGCACCACT 



TAG AT AC C AA 
AACAAAGGTC 
ATGAGACACA 
GTTTTTTCAT 
TTGATCAGGA 
TTAGAACAAC 
ATTTTAAGCC 
AAAATAATAG 
TTTAAATTTA 
TACATTTTAT 
AAAAGGCATA 
CTGATGTTTC 
TAATTATCAT 
GTGGATAAGT 
CGGGCATGGT 
CTGAGGTCGG 
ATACAAAATT 
GGCAGGAGAA 
GTACTCCAGC 



TGTCTATACT 
AAGAGGAGAG 
TTTAGATAAC 
AGTGGGGATT 
TCCAGGAGTT 
TCATTATCAG 



CACCTGTTTC 
TTCAAATTAC 
TCCCATATCT 
TGGCTATTTT 
ATTCTTCTGT 
TATTAACATA 
TAGAAAATTT 
TCAACTTGCA 
GTTTGTGTTC 
GGCTTACATC 
GAGTTCTAGA 
AGCTGGGCAT 
TTGCTTGAAC 
CTGGGTGACA 



CAATTCTTGA 
ATGTGGTCAT 
GATTTTTCTG 
GTGAAAAGTG 
AAAGAAATTT 
AAACACTCAT 
GTTGAATACA 
ATGTTTAAGT 
GAAGTTTTAG 
CACATCTTAG 
ACTAATAACT 
ATTATCAACA 
GTAATCATAT 
TACACGATGA 
TAAAATATCT 
AAAATTGCAA 
CCACTTCTAT 
AAAGAGACAA 



1080 
1140 
1200 
1260 
1320 
1380 
1440 



2100 
2160 
2220 
2280 
2340 



TQTAATCCCA 2880 



GGTGGCACAT 
CCGGGAGGCG 
AAGTCAGACT 



YGMYAYAGWF 
NAVAVTFSER 
RKHTPLPAVI 
FKVPLFIPAL, 
SEKITRTLQI 



QNTGSVGMSL 
LLIIRPAATA 
QIFLTFCKLT 



VLHPLTMIML 
FSFTCLFMVA 
ILEWPEEDK 



TIWTVCGVLS 
VISIAFGRYI 
AILIIIVPGV 
NPEKTI PLAI 
IFVALSCFGS 
FSGDLDSLLN 
LSLYSDPFST 



31 

I 

EPPGQEKVQL 
LFGALSYAEL 
LEPFFIQCBI 
MQLIKGQTQN 
CISMAIVTIG 
MNGGVFAVSR 



GIGFVITLTG 



41 

I 

KRKVTLLRGV 
GTTIKKSGGK 
PELAIKLITA 
FKDAFSGRDS 
YVLTNVAYFT 
LFYVASREGH 
GIAVAGLIYL 
VPAYYLFIIW 



51 

I 

SIIIGTIIGA 
YTYILEVFGP 
VGITWMVLN 
SITRLPLAFY 
TINAEELLLS 
LPEILSMIHV 
RYKCPDMHRP 
DKKPRWFRIM 



Seq ID NO: 35 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002422 
Coding sequence: 64..1497 (underlined sequences correspond to start and stop codons) 



I I I I I I 

ACAAGGAGGC AGGCAAGACA GCAAGGCATA GAGACAACAT AGAGCTAAGT AAAGCCAGTG 
GAAATGAAGA GTCTTCCAAT CCTACTGTTG CTGTGCGTGG CAGTTTGCTC AGCCTATCCA 
TTGGATGGAG CTGCAAGGGG TGAGGACACC AGCATGAACC TTGTTCAGAA ATATCTAGAA 
AACTACTACG ACCTCAAAAA AGATGTGAAA CAGTTTGTTA GGAGAAAGGA CAGTGGTCCT 
GTTGTTAAAA AAATCCGAGA AATGCAGAAG TTCCTTGGAT TGGAGGTGAC GGGGAAGCTG 
GACTCCGACA CTCTGGAGGT GATGCGCAAG CCCAGGTGTG GAGTTCCTGA TGTTGGTCAC 360 

TTCAGAACCT TTCCTGGCAT CCCGAAGTGG AGGAAAACCC ACCTTACATA CAGGATTGTG 420 

AATTATACAC CAGATTTGCC AAAAGATGCT GTTGATTCTG CTGTTGAGAA AGCTCTGAAA 480 

GTCTGGGAAG AGGTGACTCC ACTCACATTC TCCAGGCTGT ATGAAGGAGA GGCTGATATA 540 

ATGATCTCTT TTGCAGTTAG AGAACATGGA GACTTTTACC CTTTTGATGG ACCTGGAAAT 600 

GTTTTGGCCC ATGCCTATGC CCCTGGGCCA GGGATTAATG GAGATGCCCA CTTTGATGAT 660 

GATGAACAAT GGACAAAGGA TACAACAGGG ACCAATTTAT TTCTCGTTGC TGCTCATGAA 720 

ATTGGCCACT CCCTGGGTCT CTTTCACTCA GCCAACACTG AAGCTTTGAT GTACCCACTC 780 

TATCACTCAC TCACAGACCT GACTCGGTTC CGCCTGTCTC AAGATGATAT AAATGGCATT 840 

CAGTCCCTCT ATGGACCTCC CCCTGACTCC CCTGAGACCC CCCTGGTACC CACGGAACCT 900 

GTCCCTCCAG AACCTGGGAC GCCAGCCAAC TGTGATCCTG CTTTGTCCTT TGATGCTGTC 960 

AGCACTCTGA GGGGAGAAAT CCTGATCTTT AAAGACAGGC ACTTTTGGCG CAAATCCCTC 1020 

AGGAAGCTTG AACCTGAATT GCATTTGATC TCTTCATTTT GGCCATCTCT TCCTTCAGGC 1080 

GTGGATGCCG CATATGAAGT TACTAGCAAG GACCTCGTTT TCATTTTTAA AGGAAATCAA 1140 

TTCTGGGCCA TCAGAGGAAA TGAGGTACGA GCTGGATACC CAAGAGGCAT CCACACCCTA 1200 

GGTTTCCCTC CAACCGTGAG GAAAATCGAT GCAGCCATTT CTGATAAGGA AAAGAACAAA 1260 

ACATATTTCT TTGTAGAGGA CAAATACTGG AGATTTGATG AGAAGAGAAA TTCCATGGAG 1320 

CCAGGCTTTC CCAAGCAAAT AGCTGAAGAC TTTCCAGGGA TTGACTCAAA GATTGATGCT 1380 

GTTTTTGAAG AATTTGGGTT CTTTTATTTC TTTACTGGAT CTTCACAGTT GGAGTTTGAC 1440 

CCAAATGCAA AGAAAGTGAC ACACACTTTG AAGAGTAACA GCTGGCTTAA TTGTTGAAAG 1500 

AGATATGTAG AAGGCACAAT ATGGGCACTT TAAATGAAGC TAATAATTCT TCACCTAAGT 1560 

CTCTGTGAAT TGAAATGTTC GTTTTCTCCT GCCTGTGCTG TGACTCGAGT CACACTCAAG 1620 

GGAACTTGAG CGTGAATCTG TATCTTGCCG GTCATTTTTA TGTTATTACA GGGCATTCAA 1680 

ATGGGCTGCT GCTTAGCTTG CACCTTGTCA CATAGAGTGA TCTTTCCCAA GAGAAGGGGA 1740 

AGCACTCGTG TGCAACAGAC AAGTGACTGT ATCTGTGTAG ACTATTTGCT TATTTAATAA 1800 

AGACGATTTG TCAGTTGTTT T 



Seq ID NO: 36 Protein sequence: 
Protein Accession #: NP_002413 



1 11 21 31 41 51 

I I I I I I 

MKSLPILLLL CVAVCSAYPL DGAARGEDTS MNLVQKYLEN YYDLKKDVKQ FVRRKDSGPV 60 

VKKIREMQKF LGLEVTGKLD SDTLEVMRKP RCGVPDVGHF RTFPGIPKWR KTHLTYRIVN 120 

YTPDLPKDAV DSAVEKALKV WEEVTPLTFS RLYEGEADIM ISFAVREHGD FYPFDGPGNV 180 

LAHAYAPGPG INGDAHFDDD EQWTKDTTGT NLFLVAAHEI GHSLGLFHSA NTEALMYPLY 240 

HSLTDLTRFR LSQDDINGIQ SLYGPPPDSP ETPLVPTEPV PPEPGTPANC DPALSFDAVS 300 

TLRGEILIFK DRHFWRKSLR KLEPELHLIS SFWPSLPSGV DAAYEVTSKD LVFIFKGNQF 360 

WAIRGNEVRA GYPRGIHTLG FPPTVRKIDA AISDKEKNKT YFFVEDKYWR FDEKRNSMEP 420 

GFPKQIAEDF PGIDSKIDAV FEEFGFFYFF TGSSQLEFDP NAKKVTHTLK SNSWLNC 



Seq ID NO: 37 Nucleotide sequence: 
Nucleic Acid Accession #: NM_003246 

Coding sequence: 112..3624 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGACGCACAG GCATTCCCCG CGCCCCTCCA GCCCTCGCCG CCCTCGCCAC CGCTCCCGGC 60 

CGCCGCGCTC CGGTACACAC AGGATCCCTG CTGGGCACCA ACAGCTCCAC CATGGGGCTG 120 

GCCTGGGGAC TAGGCGTCCT GTTCCTGATG CATGTGTGTG GCACCAACCG CATTCCAGAG 180 

TCTGGCGGAG ACAACAGCGT GTTTGACATC TTTGAACTCA CCGGGGCCGC CCGCAAGGGG 240 

TCTGGGCGCC GACTGGTGAA GGGCCCCGAC CCTTCCAGCC CAGCTTTCCG CATCGAGGAT 300 

GCCAACCTGA TCCCCCCTGT GCCTGATGAC AAGTTCCAAG ACCTGGTGGA TGCTGTGCGG 360 

GCAGAAAAGG GTTTCCTCCT TCTGGCATCC CTGAGGCAGA TGAAGAAGAC CCGGGGCACG 420 

CTGCTGGCCC TGGAGCGGAA AGACCACTCT GGCCAGGTCT TCAGCGTGGT GTCCAATGGC 480 

AAGGCGGGCA CCCTGGACCT CAGCCTGACC GTCCAAGGAA AGCAGCACGT GGTGTCTGTG 540 

GAAGAAGCTC TCCTGGCAAC CGGCCAGTGG AAGAGCATCA CCCTGTTTGT GCAGGAAGAC 600 

AGGGCCCAGC TGTACATCGA CTGTGAAAAG ATGGAGAATG CTGAGTTGGA CGTCCCCATC 660 

CAAAGCGTCT TCACCAGAGA CCTGGCCAGC ATCGCCAGAC TCCGCATCGC AAAGGGGGGC 720 

GTCAATGACA ATTTCCAGGG GGTGCTGCAG AATGTGAGGT TTGTCTTTGG AACCACACCA 780 

GAAGACATCC TCAGGAACAA AGGCTGCTCC AGCTCTACCA GTGTCCTCCT CACCCTTGAC 840 

AACAACGTGG TGAATGGTTC CAGCCCTGCC ATCCGCACTA ACTACATTGG CCACAAGACA 900 

AAGGACTTGC AAGCCATCTG CGGCATCTCC TGTGATGAGC TGTCCAGCAT GGTCCTGGAA 960 

CTCAGGGGCC TGCGCACCAT TGTGACCACG CTGCAGGACA GCATCCGCAA AGTGACTGAA 1020 

GAGAACAAAG AGTTGGCCAA TGAGCTGAGG CGGCCTCCCC TATGCTATCA CAACGGAGTT 1080 

CAGTACAGAA ATAACGAGGA ATGGACTGTT GATAGCTGCA CTGAGTGTCA CTGTCAGAAC 1140 

TCAGTTACCA TCTGCAAAAA GGTGTCCTGC CCCATCATGC CCTGCTCCAA TGCCACAGTT 1200 

CCTGATGGAG AATGCTGTCC TCGCTGTTGG CCCAGCGACT CTGCGGACGA TGGCTGGTCT 1260 

CCATGGTCCG AGTGGACCTC CTGTTCTACG AGCTGTGGCA ATGGAATTCA GCAGCGCGGC 1320 

CGCTCCTGCG ATAGCCTCAA CAACCGATGT GAGGGCTCCT CGGTCCAGAC ACGGACCTGC 1380 

CACATTCAGG AGTGTGACAA AAGATTTAAA CAGGATGGTG GCTGGAGCCA CTGGTCCCCG 1440 

TGGTCATCTT GTTCTGTGAC ATGTGGTGAT GGTGTGATCA CAAGGATCCG GCTCTGCAAC 1500 
TCTCCCAGCC CCCAGATGAA TGGGAAACCC TGTGAAGGCG AAGCGCGGGA GACCAAAGCC 1560 

TGCAAGAAAG ACGCCTGCCC CATCAATGGA GGCTGGGGTC CTTGGTCACC ATGGGACATC 1620 

TGTTCTGTCA CCTGTGGAGG AGGGGTACAG AAACGTAGTC GTCTCTGCAA CAACCCCGCA 1680 

CCCCAGTTTG GAGGCAAGGA CTGCGTTGGT GATGTAACAG AAAACCAGAT CTGCAACAAG 1740 

CAGGACTGTC CAATTGATGG ATGCCTGTCC AATCCCTGCT TTGCCGGCGT GAAGTGTACT 1800 

AGCTACCCTG ATGGCAGCTG GAAATGTGGT GCTTGTCCCC CTGGTTACAG TGGAAATGGC 1860 

ATCCAGTGCA CAGATGTTGA TGAGTGCAAA GAAGTGCCTG ATGCCTGCTT CAACCACAAT 1920 

GGAGAGCACC GGTGTGAGAA CACGGACCCC GGCTACAACT GCCTGCCCTG CCCCCCACGC 1980 

TTCACCGGCT CACAGCCCTT CGGCCAGGGT GTCGAACATG CCACGGCCAA CAAACAGGTG 2040 

TGCAAGCCCC GTAACCCCTG CACGGATGGG ACCCACGACT GCAACAAGAA CGCCAAGTGC 2100 

AACTACCTGG GCCACTATAG CGACCCCATG TACCGCTGCG AGTGCAAGCC TGGCTACGCT 2160 

GGCAATGGCA TCATCTGCGG GGAGGACACA GACCTGGATG GCTGGCCCAA TGAGAACCTG 2220 

GTGTGCGTGG CCAATGCGAC TTACCACTGC AAAAAGGATA ATTGCCCCAA CCTTCCCAAC 2280 

TCAGGGCAGG AAGACTATGA CAAGGATGGA ATTGGTGATG CCTGTGATGA TGACGATGAC 2340 

AATGATAAAA TTCCAGATGA CAGGGACAAC TGTCCATTCG ATTACAACCC AGCTCAGTAT 2400 

GACTATGACA GAGATGATGT GGGAGACCGC TGTGACAACT GTCCCTACAA CCACAACCCA 2460 

GATCAGGCAG ACACAGACAA CAATGGGGAA GGAGACGCCT GTGCTGCAGA CATTGATGGA 2520 

GACGGTATCC TCAATGAACG GGACAACTGC CAGTACGTCT ACAATGTGGA CCAGAGAGAC 2580 

ACTGATATGG ATGGGGTTGG AGATCAGTGT GACAATTGCC CCTTGGAACA CAATCCGGAT 2640 

CAGCTGGACT CTGACTCAGA CCGCATTGGA GATACCTGTG ACAACAATCA GGATATTGAT 2700 

GAAGATGGCC ACCAGAACAA TCTGGACAAC TGTCGCTATG TGCCCAATGC CAACCAGGCT 2760 

GACCATGACA AAGATGGCAA GGGAGATGCC TGTGACCACG ATGATGACAA CGATGGCATT 2820 

CCTGATGACA AGGACAACTG CAGACTCGTG CCCAATCCCG ACCAGAAGGA CTCTGACGGC 2880 

GATGGTCGAG GTGATGCCTG CAAAGATGAT TTTGACCATG ACAGTGTGCC AGACATCGAT 2940 

GACATCTGTC CTGAGAATGT TGACATCAGT GAGACCGATT TCCGCCGATT CCAGATGATT 3000 

CCTCTGGACC CCAAAGGGAC ATCCCAAAAT GACCCTAACT GGGTTGTACG CCATCAGGGT 3060 

AAAGAACTCG TCCAGACTGT CAACTGTGAT CCTGGACTCG CTGTAGGTTA TGATGAGTTT 3120 

AATGCTGTGG ACTTCAGTGG CACCTTCTTC ATCAACACCG AAAGGGACGA TGACTATGCT 3180 

GGATTTGTCT TTGGCTACCA GTCCAGCAGC CGCTTTTATG TTGTGATGTG GAAGCAAGTC 3240 

ACCCAGTCCT ACTGGGACAC CAACCCCACG AGGGCTCAGG GATACTCGGG CCTTTCTGTG 3300 

AAAGTTGTAA ACTCCACCAC AGGGCCTGGC GAGCACCTGC GGAACGCCCT GTGGCACACA 3360 

GGAAACACCC CTGGCCAGGT GCGCACCCTG TGGCATGACC CTCGTCACAT AGGCTGGAAA 3420 

GATTTCACCG CCTACAGATG GCGTCTCAGC CACAGGCCAA AGACGGGTTT CATTAGAGTG 3480 

GTGATGTATG AAGGGAAGAA AATCATGGCT GACTCAGGAC CCATCTATGA TAAAACCTAT 3540 

GCTGGTGGTA GACTAGGGTT GTTTGTCTTC TCTCAAGAAA TGGTGTTCTT CTCTGACCTG 3600 

AAATACGAAT GTAGAGATCC CTAATCATCA AATTGTTGAT TGAAAGACTG ATCATAAACC 3660 

AATGCTGGIA TTGCACCTTC TGGAACTATG GGCTTGAGAA AACCCCCAGG ATCACTTCTC 3720 

CTTGGCTTCC TTCTTTTCTG TGCTTGCATC AGTGTGGACT CCTAGAACGT GCGACCTGCC 3780 

TCAAGAAAAT GCAGTTTTCA AAAACAGACT CATCAGCATT CAGCCTCCAA TGAATAAGAC 3840 

40 ATCTTCCAAG CATATAAACA ATTGCTTTGG TTTCCTTTTG AAAAAGCATC TACTTGCTTC 3500 

AGTTGGGAAG GTGCCCATTC CACTCTGCCT TTGTCACAGA GCAGGGTGCT ATTGTGAGGC 3960 

CATCTCTGAG CAGTGGACTC AAAAGCATTT TCAGGCATGT CAGAGAAGGG AGGACTCACT 402 0 

AGAATTAGCA AACAAAACCA CCCTGACATC CTCCTTCAGG AACACGGGGA GCAGAGGCCA 4080 

AAGCACTAAG GGGAGGGCGC ATACCCGAGA CGATTGTATG AAGAAAATAT GGAGGAACTG 414 0 

45 TTACATGTTC GGTACTAAGT CATTTTCAGG GGATTGAAAG ACTATTGCTG GATTTCATGA 42 00 

TGCTGACTGG CGTTAGCTGA TTAACCCATG TAAATAGGCA CTTAAATAGA AGCAGGAAAG 4260 

GGAGACAAAG ACTGGCTTCT GGACTTCCTC CCTGATCCCC ACCCTTACTC ATCACCTTGC 432 0 

AGTGGCCAGA ATTAGGGAAT CAGAAT CAAA CCAGTGTAAG GCAGTGCTGG CTGCCATTGC 43 80 

CTGGTCACAT TGAAATTGGT GGCTTCATTC TAGATGTAGC T TGTGCAG AT GTAGCAGGAA 444 0 

50 AATAGGAAAA CCTACCATCT CAGTGAGCAC CAGCTGCCTC CCAAAGGAGG GGCAGCCGTG 45 00 

CTTATATTTT TATGGTTACA ATGGCACAAA ATTATTATCA ACCTAACTAA AACATTCCTT 456 0 

TTCTCTTTTT TCCGTAATTA CTAGGTAGTT TTCTAATTCT CTCTTTTGGA AGTATGATTT 462 0 

TTTTAAAGTC TTTACGATGT AAAATATTTA TTTTTTACTT ATTCTGGAAG ATCTGGCTGA 4680 

AGGATTATTC ATGGAACAGG AAGAAGCGTA AAGACTATCC ATGTCATCTT TGTTGAGAGT 4740 

55 CTTCGTGACT GTAAGATTGT AAATACAGAT TATTTATTAA CTCTGTTCTG CCTGGAAATT 4800 

TAGGCTTCAT ACGGAAAGTG TTTGAGAGCA AGTAGTTGAC ATTTATCAGC AAATCTCTTG 4860 

CAAGAACAGC ACAAGGAAAA TCAGTCTAAT AAGCTGCTCT GCCCCTTGTG CTCAGAGTGG 4920 

ATGTTATGGG ATTCCTTTTT TCTCTGTTTT ATCTTTTCAA GTGGAATTAG TTGGTTATCC 4980 

ATTTGCAAAT GTTTTAAATT GCAAAGAAAG CCATGAGGTC TTCAATACTG TTTTACCCCA 5040 

60 TCCCTTGTGC ATATTTCCAG GGAGAAGGAA AGCATATACA CTTTTTTCTT TCATTTTTCC 5100 

AAAAGAGAAA AAAATGACAA AAGGTGAAAC TTACATACAA ATATTACCTC ATTTGTTGTG 5160 

TGACTGAGTA AAGAATTTTT GGATCAAGCG GAAAGAGTTT AAGTGTCTAA CAAACTTAAA 522 0 

GCTACTGTAG TACCTAAAAA GTCAGTGTTG TACATAGCAT AAAAACTCTG CAGAGAAGTA 5280 

TTCCCAATAA GGAAATAGCA TTGAAATGTT AAATACAATT TCTGAAAGTT ATGTTTTTTT 5340 

TCTATCATCT GGTATACCAT TGCTTTATTT TTATAAATTA TTTTCTCATT GCCATTGGAA 5400 

TAGAATATTC AGATTGTGTA GATATGCTAT TTAAATAATT TATCAGGAAA TACTGCCTGT 5460 

AGAGTTAGTA TTTCTATTTT TATATAATGT TTGCACACTG AATTGAAGAA TTGTTGGTTT 5520 

TTTCTTTTTT TTGTTTTTTT TTTTTTTTTT TTTTTTTTTG CTTTTGACCT CCCATTTTTA 5580 

CTATTTGCCA ATACCTTTTT CTAGGAATGT GCTTTTTTTT GTACACATTT TTATCCATTT 5640 

TACATTCTAA AGCAGTGTAA GTTGTATATT ACTGTTTCTT ATGTACAAGG AACAACAATA 5700 
AATCATATGG AAATTTATAT TT 

Seq ID NO: 38 Protein sequence: 
Protein Accession #: NP_003237 
MGLAWGLGVL FLMHVCGTNR IPESGGDNSV FDIFELTGAA RKGSGRRLVK GPDPSSPAFR 60 

IEDANLIPPV PDDKFQDLVD AVRAEKGFLL LASLRQMKKT RGTLLALERK DHSGQVFSVV 120 

SNGKAGTLDL SLTVQGKQHV VSVEEALLAT GQWKSITLFV QEDRAQLYID CEKMENAELD 180 

VPIQSVFTRD LASIARLRIA KGGVNDNFQG VLQNVRFVFG TTPEDILRNK GCSSSTSVLL 240 

TLDNNVVNGS SPAIRTNYIG HKTKDLQAIC GISCDELSSM VLELRGLRTI VTTLQDSIRK 300 

VTEENKELAN ELRRPPLCYH NGVQYRNNEE WTVDSCTECH CQNSVTICKK VSCPIMPCSN 360 

ATVPDGECCP RCWPSDSADD GWSPWSEWTS CSTSCGNGIQ QRGRSCDSLN NRCEGSSVQT 420 

RTCHIQECDK RFKQDGGWSH WSPWSSCSVT CGDGVITRIR LCNSPSPQMN GKPCEGEARE 480 

TKACKKDACP INGGWGPWSP WDICSVTCGG GVQKRSRLCN NPAPQFGGKD CVGDVTENQI 540 

CNKQDCPIDG CLSNPCFAGV KCTSYPDGSW KCGACPPGYS GMGIQCTDVD ECKEVPDACF 600 

NHNGEHRCEN TDPGYNCLPC PPRFTGSQPF GQGVEHATAN KQVCKPRNPC TDGTHDCNKN 660 

AKCNYLGHYS DPMYRCECKP GYAGNGIICG EDTDLDGWPN ENLVCVANAT YHCKKDNCPN 720 

LPNSGQEDYD KDGIGDACDD DDDNDKIPDD RDNCPFHYNP AQYDYDRDDV GDRCDNCPYN 780 

HNPDQADTDN NGEGDACAAD IDGDGILNER DNCQYVYNVD QRDTDMDGVG DQCDKCPLEH 840 

NPDQLDSDSD RIGDTCDNNQ DIDEDGHQNN LDNCPYVPNA NQADKDKDGK GDACDHDDDN 900 

DGIPDDKDNC RLVPNPDQKD SDGDGRGDAC KDDFDHDSVP DIDDICPENV DISETDFRRF 960 

QMIPLDPKGT SQNDPNWVVR HQGKELVQTV NCDPGLAVGY DEFNAVDFSG TFFINTERDD 1020 

DYAGFVFGYQ SSSRFYVVMW KQVTQSYWDT NPTRAQGYSG LSVKVVNSTT GPGEHLRNAL 1080 

WHTGNTPGQV RTLWHDPRHI GWKDFTAYRW RLSHRPKTGF IRVVMYEGKK IMADSGPIYD 1140 
KTYAGGRLGL FVFSQEMVFF SDLKYECRDP 



Seq ID NO: 39 Nucleotide sequence: 



CCCGACCCGT GCGAGGGCCA GGTCCGCGCC 



CCCTGGACGC 
ACAAGGGCTC 
ACGAGAGGAA 
TGCTGGGAAA 
CGGAGCGGCT 
GGAAGAAGCA 
TCTCCCGGGA 
AGAAGGAGGA 
ACCACGAGGG 
CGTACGGGCT 
CCTTCTTCTC 
CAGGGCACCC 
GCTCCCTGGC 
CCCCATCTCC 
CCCACCTGGG 
TGAGCCAGGT 
CTCCTGGCCA 



CGGCCACGTA 
CCCTCGCGCC 
TTTCCTCCCA 
GCTGGACTCT 
GCCCACATTT 
CTCCCAGTGG 
GACAGACTTG 
ATTAATAAAG 
TGATAATTTT 
TCCAAAGTGA 

TATTTTATTT 
ACACACACTT 
TATAAAATTC 
GATGTTTAAA 



GGCTTCGCTG 
CGAGCTGTCG 
CGAGAGCCGT 
ACGGCTGGCA 
GTCGTGGAAG 
GCGCCTGCAG 
GGCCAAGCGG 
CCAGAACGCC 
CAGGGGTGAG 
GCCGGCTGGT 
GCCCACACCT 
CTCCCCCTGC 
GTACTCACCG 
CCTTGGCCAG 
TGCCTATTAC 
CCAGCTTTCC 
GGAACTCCTG 
CCCAGACTCC 
ACCAACGGGT 
CTACAACAGC 
CTCTCCTTCT 
CCGCTCAGGG 
CCTTATCCGA 
TAAGTATATT 
AATGTTCACT 
ATAGCCAAGG 
GAAGATGGGG 
GTGTGCACAG 
CCACAAAATT 
ATTAACCAGT 
TAAATATACA 
CAAGAGCCAC 
AGTGTATTAG 
AACAAAACAG 



CTGGGAGCCT 
GATGGACAAT 
ATCCGGCGGC 
GTGCAGAACC 
GCGCTGACGC 
CACATGCAGG 
CTGTGCAAGC 
CTGCCGGAGA 
TACTCCCCCG 
GGTGGCGGCG 
CCTGAAATGT 
CAGGAGGAGC 
GAGTACGCCC 
TCCCCCGGCG 
TCCCCGGCCA 



GCCACAGGGG 
CCCACAGAGA 
TACAGTGTGT 
TGTGCCTTGA 
CAGGGAGGTC 
GTGCCGCCTC 
CCTTCAAGTG 
GACGTCTTTT 
TCCCTTCTGG 
AAATTTGACT 
CCCAAGGACC 
TCAAAGGGAC 
ATGGCTAACT 
TTTTAAAGCA 
CGCGCCCAGC 
TTTCATTACA 
GCTGTTGTAA 



TGCCCCGCCA 
ACCCTTGGCC 
CGCCGCCGGC 
CCATGAACGC 
CGGACCTGCA 
TGTCCCAGAA 
ACTACCCCAA 
GCGTGGACCC 
AGAC-AAGCGG 
GCACTGCCCT 
GCGGCACCCC 
CTCCCCTGGA 
ATGGCCATCC 
CAAGCCCTCT 
TCTCCATGAT 
CCTACCACCC 
AGCACCCTGG 
ATCGCAATGA 
CCATGGCCCT 
CCAGCCTCAT 
CATAGAGCTG 
GTGGCAGAGG 



GAGGCCGTAC 



CGAGGGTCTC GAGTGCCCGG 120 

CGTCCCCCGG CCCCCGGGGG 180 

CTTCATGGTT TGGGCCAAGG 240 

CTCAGCAAGA 300 

GTGGACGAGG 360 

CGGCCGCGCA 420 

CTGAGCXCCC 4 80 

GCGCTGGGGG 540 

CGGGGCTGCT 600 

GACACGTACC 660 

CCGGAGCAGA 720 

CCCCACCTGC 780 

CACCCCCTGG 840 

CCCGGCTGTC 900 

AACCTCCAAG 960 
CTGGATCAAC 
TATTTGAACA 
GTTCCGGTCT 
GCTGATGCCA 



GGGCTTCCTT 
CAGCCGGGGG 
GCCCAGCCTC 
GAGCAGTGTG 
CGTGCTGGAG 
CCGCCGCATC 
CCACTGTAGC 
GTCCCCTGTA 
ACTCCACTCC 



TATCCCCTTC 
AGTTTTCCTC 
CTTGGTAGCC 
TCCAGTTTTC 
CATTAATGAG 
ACGAGGCTTT 
TCATACAATT 
ATATCACAGA 
GTT! 



ATTCGACCAG 
CAGTGGGCAT 
CTCCGTCCTG 
GAGGCGCCCC 
AGCCGTCCAG 
CCCCAGAGCC 
CCCACGTTCC 



ATCATCGAAA 
TGATTTAGGG 
CTCC-CTAACC 
CTGCACTTTC 
TGAGAAAAAA 
AAATGGGATT 
TTTGTTAATT 
AATTTTCATT 
TATATTTCTA 
AAAAAAAAA 



CCACACCAGC 
TTTGGCCTAA 
AGCCCCTGCA 
GAGTTGCTGT 
CTAATGGGGG 
TTCTCTCAAG 
TACGATCTGG 
TGCACCCCCT 
CAGTCAACCT 
GAGTTAAAAC 
TGTTTATTAT 
CTCTTTTACC 
AACATTTTAT 



1200 
1260 
1320 



| | | | | | 

MASLLGAYPW PEGLECPALD AELSDGQSPP AVPRPPGDKG SESRIRRPMN AFMVWAKDER 60 

KRDAVQNPDL HNAELSKMLG KSWKALTLSQ KRPYVDEAER LRLQHMQDYP NYKYRPRRKK 120 

QAKRLCKRVD PGFLLSSLSR DQNALPEKRS GSRGALGEKE DRGEYSPGTA LPSLRGCYHE 180 

GPAGGGGGGT PSSVDTYPYG LPTPPEMSPL DVLEPEQTFF SSPCQEEHGH PRRIPHLPGH 240 

PYSPEYAPSP LHCSHPLGSL ALGQSPGVSM MSPVPGCPPS PAYYSPATYH PLHSNLQAHL 300 

GQLSPPPEHP GFDALDQLSQ VELLGDMDRN EFDQYLNTPG KPDSATGAMA LSGHVPVSQV 360 
TPTGPTETSL ISVLADATAT YYNSYSVS 



Seq ID NO: 41 Nucleotide sequence: 
Nucleic Acid Accession #: NM_004449 

Coding sequence: 1..1389 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGATTCAGA CTGTCCCGGA CCCAGCAGCT CATATCAAGG AAGCCTTATC AGTTGTGAGT 60 

GAGGACCAGT CGTTGTTTGA GTGTGCCTAC GGAACGCCAC ACCTGGCTAA GACAGAGATG 120 

ACCGCGTCCT CCTCCAGCGA CTATGGACAG ACTTCCAAGA TGAGCCCACG CGTCCCTCAG 180 

CAGGATTGGC TGTCTCAACC CCCAGCCAGG GTCACCATCA AAATGGAATG TAACCCTAGC 240 

CAGGTGAATG GCTCAAGGAA CTCTCCTGAT GAATGCAGTG TGGCCAAAGG CGGGAAGATG 300 

GTGGGCAGCC CAGACACCGT TGGGATGAAC TACGGCAGCT ACATGGAGGA GAAGCACATG 360 

CCACCCCCAA ACATGACCAC GAACGAGCGC AGAGTTATCG TGCCAGCAGA TCCTACGCTA 420 

TGGAGTACAG ACCATGTGCG GCAGTGGCTG GAGTGGGCGG TGAAAGAATA TGGCCTTCCA 480 

GACGTCAACA TCTTGTTATT CCAGAACATC GATGGGAAGG AACTGTGCAA GATGACCAAG 540 

GACGACTTCC AGAGGCTCAC CCCCAGCTAC AACGCCGACA TCCTTCTCTC ACATCTCCAC 600 

TACCTCAGAG AGACTCCTCT TCCACATTTG ACTTCAGATG ATGTTGATAA AGCCTTACAA 660 

AACTCTCCAC GGTTAATGCA TGCTAGAAAC ACAGATTTAC CATATGAGCC CCCCAGGAGA 720 

TCAGCCTGGA CCGGTCACGG CCACCCCACG CCCCAGTCGA AAGCTGCTCA ACCATCTCCT 780 

TCCACAGTGC CCAAAACTGA AGACCAGCGT CCTCAGTTAG ATCCTTATCA GATTCTTGGA 840 

CCAACAAGTA GCCGCCTTGC AAATCCAGGC AGTGGCCAGA TCCAGCTTTG GCAGTTCCTC 900 

CTGGAGCTCC TGTCGGACAG CTCCAACTCC AGCTGCATCA CCTGGGAAGG CACCAACGGG 960 

GAGTTCAAGA TGACGGATCC CGACGAGGTG GCCCGGCGCT GGGGAGAGCG GAAGAGCAAA 1020 

CCCAACATGA AGTACGATAA GCTCAGCCGC GCCCTCCGTT ACTACTATGA CAAGAACATC 1080 

ATGACCAAGG TCCATGGGAA GCGCTACGCC TACAAGTTCG ACTTCCACGG GATCGCCCAG 1140 

GCCCTCCAGC CCCACCCCCC GGAGTCATCT CTGTACAAGT ACCCCTCAGA CCTCCCGTAC 1200 

ATGGGCTCCT ATCACGCCCA CCCACAGAAG ATGAACTTTG TGGCGCCCCA CCCTCCAGCC 1260 

CTCCCCGTGA CATCTTCCAG TTTTTTTGCT GCCCCAAACC CATACTGGAA TTCACCAACT 1320 

GGGGGTATAT ACCCCAACAC TAGGCTCCCC ACCAGCCATA TGCCTTCTCA TCTGGGCACT 1380 
TACTACTAA 



Seq ID NO: 42 Protein sequence: 
Protein Accession #: NP_004440 



1 11 21 31 41 51 

| | | | | | 

MIQTVPDPAA HIKEALSVVS EDQSLFECAY GTPHLAKTEM TASSSSDYGQ TSKMSPRVPQ 60 

QDWLSQPPAR VTIKMECNPS QVNGSRNSPD ECSVAKGGKM VGSPDTVGMN YGSYMEEKHM 120 

PPPNMTTNER RVIVPADPTL WSTDHVRQWL EWAVKEYGLP DVNILLFQNI DGKELCKMTK 180 

DDFQRLTPSY NADILLSHLH YLRETPLPHL TSDDVDKALQ NSPRLMHARN TDLPYEPPRR 240 

SAWTGHGHPT PQSKAAQPSP STVPKTEDQR PQLDPYQILG PTSSRLANPG SGQIQLWQFL 300 

LELLSDSSNS SCITWEGTNG EFKMTDPDEV ARRWGERKSK PNMNYDKLSR ALRYYYDKNI 360 

MTKVHGKRYA YKFDFHGIAQ ALQPHPPESS LYKYPSDLPY MGSYHAHPQK MNFVAPHPPA 420 
LPVTSSSFFA APNPYWNSPT GGIYPNTRLP TSHMPSHLGT YY 



Seq ID NO: 43 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005100 

Coding sequence: 192.. 5537 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

CCTTCTTTTA AGGAGTTTGC CGCGAGCGCG TCTCCTTCAT TCGCAGGCTG GGCGCGTTCG 60 

CAGTCGGCTG GCGGCGAAGG AAGGCGCTCT CGGGACCTCA CGGGCGCGCG TCTTTTGGCT 120 

CTTGCCCCTG TCCCTGCGGC TTGGGGAAAG CGTAACCCGG CGGCTAGGCG CGGGAGAAGT 180 

GCGGAGGAGC CATGGGCGCC GGGAGCTCCA CCGAGCAGCG CAGCCCGGAG CAGCCGCCCG 240 

AGGGGAGCTC CACGCCGGCT GAGCCCGAGC CCAGCGGCGG CGGCCCCTCG GCCGAGGCGG 300 

CGCCAGACAC CACCGCGGAC CCCGCCATCG CTGCCTCGGA CCCCGCCACC AAGCTCCTAC 360 

AGAAGAATGG TCAGCTGTCC ACCATCAATG GCGTAGCTGA GCAAGATGAG CTCAGCCTCC 420 

AGGAGGGTGA CCTAAATGGC CAGAAAGGAG CCCTGAACGG TCAAGGAGCC CTAAACAGCC 480 

AGGAGGAAGA AGAAGTCATT GTCACGGAGG TTGGACAGAG AGACTCTGAA GATGTGAGCG 540 

AAAGAGACTC CGATAAAGAG ATGGCTACTA AGTCAGCGGT TGTTCACGAC ATCACAGATG 600 

ATGGGCAGGA GGAGAACCGA AATATCGAAC AGATTCCTTC TTCAGAAAGC AATTTAGAAG 660 

AGCTAACACA ACCCACTGAG TCCCAGGCTA ATGATATTGG ATTTAAGAAG GTGTTTAAGT 720 

TTGTTGGCTT TAAATTCACT GTGAAAAAGG ATAAGACAGA GAAGCCTGAC ACTGTCCAGC 780 

TACTCACTGT GAAGAAAGAT GAAGGGGAGG GAGCAGCAGG GGCTGGCGAC CACCAGGACC 840 

CCAGCCTTGG GGCTGGAGAA GCAGCATCCA AAGAAAGCGA ACCCAAACAA TCTACAGAGA 900 

AACCCGAAGA GACCCTGAAG CGTGAGCAAA GCCACGCAGA AATTTCTCCC CCAGCCGAAT 960 

CTGGCCAAGC AGTGGAGGAA TGCAAAGAGG AAGGAGAAGA GAAACAAGAA AAAGAACCTA 1020 

GCAAGTCTGC AGAATCTCCG ACTAGTCCCG TGACCAGTGA AACAGGATCA ACCTTCAAAA 1080 

AATTCTTCAC TCAAGGTTGG GCCGGCTGGC GCAAAAAGAC CAGTTTCAGG AAGCCGAAGG 1140 

AGGATGAAGT GGAAGCTTCA GAGAAGAAAA AGGAACAAGA GCCAGAAAAA GTAGACACAG 1200 

AAGAAGACGG AAAGGCAGAG GTTGCCTCCG AGAAACTGAC CGCCTCCGAG CAAGCCCACC 1260 

CACAGGAGCC GGCAGAAAGT GCCCACGAGC CCCGGTTATC AGCTGAATAT GAGAAAGTTG 1320 

AGCTGCCCTC AGAGGAGCAA GTCAGTGGCT CGCAGGGACC TTCTGAAGAG AAACCTGCTC 1380 

CGTTGGCGAC AGAAGTGTTT GATGAGAAAA TAGAAGTCCA CCAAGAAGAG GTTGTGGCCG 1440 

AAGTCCACGT CAGCACCGTG GAGGAGAGAA CCGAAGAGCA GAAAACGGAG GTGGAAGAAA 1500 

CAGCAGGGTC TGTGCCAGCT GAAGAATTGG TTGGAATGGA TGCAGAACCT CAGGAAGCCG 1560 

AACCTGCCAA GGAGCTGGTG AAGCTCAAAG AAACGTGTGT TTCCGGAGAG GACCCTACAC 1620 

AGGGAGCTGA CCTCAGTCCT GATGAGAAGG TGCTGTCCAA ACCCCCCGAA GGCGTTGTGA 1680 

GTGAGGTGGA AATGCTGTCA TCACAGGAGA GAATGAAGGT GCAGGGAAGT CCACTAAAGA 1740 

AGCTTTTTAC CAGCACTGGC TTAAAAAAGC TTTCTGGAAA GAAACAGAAA GGGAAAAGAG 1800 

GAGGAGGAGA CGAGGAATCA GGGGAGCACA CTCAGGTTCC AGCCGATTCT CCGGACAGCC 1860 

AGGAGGAGCA AAAGGGCGAG AGCTCTGCCT CATCCCCTGA GGAGCCCGAG GAGATCACGT 1920 

GTCTGGAAAA GGGCTTAGCC GAGGTGCAGC AGGATGGGGA AGCTGAAGAA GGAGCTACTT 1980 

CCGATGGAGA GAAAAAAAGA GAAGGTGTCA CTCCCTGGGC ATCATTCAAA AAGATGGTGA 2040 

CGCCCAAGAA GCGTGTTAGA CGGCCTTCGG AAAGTGATAA AGAAGATGAG CTGGACAAGG 2100 

TCAAGAGCGC TACCTTGTCT TCCACCGAGA GCACAGCCTC TGAAATGCAA GAAGAAATGA 2160 

AAGGGAGCGT GGAAGAGCCA AAGCCGGAAG AACCAAAGCG CAAGGTGGAT ACCTCAGTAT 2220 

CTTGGGAAGC TTTAATTTGT GTGGGATCAT CCAAGAAAAG AGCAAGGAGA AGGTCCTCTT 2280 

CTGATGAGGA AGGGGGACCA AAAGCAATGG GAGGAGACCA CCAGAAAGCT GATGAGGCCG 2340 

GAAAAGACAA AGAGACGGGG ACAGACGGGA TCCTTGCTGG TTCCCAAGAA CATGATCCAG 2400 

GGCAGGGAAG TTCCTCCCCG GAGCAAGCTG GAAGCCCTAC CGAAGGGGAG GGCGTTTCCA 2460 

CCTGGGAGTC ATTTAAAAGG TTAGTCACGC CAAGAAAAAA ATCAAAGTCC AAGCTGGAAG 2520 

AGAAAAGCGA AGACTCCATA GCTGGGTCTG GTGTAGAACA TTCCACTCCA GACACTGAAC 2580 

CCGGTAAAGA AGAATCCTGG GTCTCAATCA AGAAGTTTAT TCCTGGACGA AGGAAGAAAA 2640 

GGCCAGATGG GAAACAAGAA CAAGCCCCTG TTGAAGACGC AGGGCCAACA GGGGCCAACG 2700 

AAGATGACTC TGATGTCCCG GCCGTGGTCC CTCTGTCTGA GTATGATGCT GTAGAAAGGG 2760 

AGAAAATGGA GGCACAGCAA GCCCAAAAAG GCGCAGAGCA GCCCGAGCAG AAGGCAGCCA 2820 

CTGAGGTGTC CAAGGAGCTC AGCGAGAGTC AGGTTCATAT GATGGCAGCA GCTGTCGCTG 2880 

ACGGGACGAG GGCAGCTACC ATTATTGAAG AAAGGTCTCC TTCTTGGATA TCTGCTTCAG 2940 

TGACAGAACC TCTTGAACAA GTAGAAGCTG AAGCCGCACT GTTAACTGAG GAGGTATTGG 3000 

AAAGAGAAGT AATTGCAGAA GAAGAACCCC CCACGGTTAC TGAACCTCTG CCAGAGAACA 3060 

GAGAGGCCCG GGGCGACACG GTCGTTAGTG AGGCGGAATT GACCCCCGAA GCTGTGACAG 3120 

CTGCAGAAAC TGCAGGGCCA TTGGGTTCCG AAGAAGGAAC CGAAGCATCT GCTGCTGAAG 3180 

AGACCACAGA AATGGTGTCA GCAGTCTCCC AGTTAACCGA CTCCCCAGAC ACCACAGAGG 3240 

AGGCCACTCC GGTGCAGGAG GTGGAAGGTG GCGTACCTGA CATAGAAGAG CAAGAGAGGC 3300 

GGACTCAAGA GGTCCTCCAG GCAGTGGCAG AAAAAGTGAA AGAGGAATCC CAGCTGCCTG 3360 

GCACCGGTGG GCCAGAAGAT GTGCTTCAGC CTGTGCAGAG AGCAGAGGCA GAAAGACCAG 3420 

AAGAGCAGGC TGAAGCGTCG GGTCTGAAGA AAGAGACGGA TGTAGTGTTG AAAGTAGATG 3480 

CTCAGGAGGC AAAAACTGAG CCTTTTACAC AAGGGAAGGT GGTGGGGCAG ACCACCCCAG 3540 

AAAGCTTTGA AAAAGCTCCT CAAGTCACAG AGAGCATAGA GTCCAGTGAG CTTGTAACCA 3600 

CTTGTCAAGC CGAAACCTTA GCTGGGGTAA AATCACAGGA GATGGTGATG GAACAGGCTA 3660 

TCCCCCCTGA CTCGGTGGAA ACCCCTACAG ACAGTGAGAC TGATGGAAGC ACCCCCGTAG 3720 

CCGACTTTGA CGCACCAGGC ACAACCCAGA AAGACGAGAT TGTGGAAATC CATGAGGAGA 3780 

ATGAGGTCGC ATCTGGTACC CAGTCAGGGG GCACAGAAGC AGAGGCAGTT CCTGCACAGA 3840 

AAGAGAGGCC TCCAGCACCT TCCAGTTTTG TGTTCCAGGA AGAAACTAAA GAACAATCAA 3900 

AGATGGAAGA CACTCTAGAG CATACAGATA AAGAGGTGTC AGTGGAAACT GTATCCATTC 3960 

TGTCAAAGAC TGAGGGGACT CAAGAGGCTG ACCAGTATGC TGATGAGAAA ACCAAAGACG 4020 

TACCATTTTT CGAAGGACTT GAGGGGTCTA TAGACACAGG CATAACAGTC AGTCGGGAAA 4080 

AGGTCACTGA AGTTGCCCTT AAAGGTGAAG GGACAGAAGA AGCTGAATGT AAAAAGGATG 4140 

ATGCTCTTGA ACTGCAGAGT CACGCTAAGT CTCCTCCATC CCCCGTGGAG AGAGAGATGG 4200 

TAGTTCAAGT CGAAAGGGAG AAAACAGAAG CAGAGCCAAC CCATGTGAAT GAAGAGAAGC 4260 

TTGAGCACGA AACAGCTGTT ACCGTATCTG AAGAGGTCAG TAAGCAGCTC CTCCAGACAG 4320 

TGAATGTGCC CATCATAGAT GGGGCAAAGG AAGTCAGCAG TTTGGAAGGA AGCCCTCCTC 4380 

CCTGCCTAGG TCAAGAGGAG GCAGTATGCA CCAAAATTCA AGTTCAGAGC TCTGAGGCAT 4440 

CATTCACTCT AACAGCGGCT GCAGAGGAGG AAAAGGTCTT AGGAGAAACT GCCAACATTT 4500 

TAGAAACAGG TGAAACGTTG GAGCCTGCAG GTGCACATTT AGTTCTGGAA GAGAAATCCT 4560 

CTGAAAAAAA TGAAGACTTT GCCGCTCATC CAGGGGAAGA TGCTGTGCCC ACAGGGCCCG 4620 

ACTGTCAGGC AAAATCGACA CCAGTGATAG TATCTGCTAC TACCAAGAAA GGCTTAAGTT 4680 

CCGACCTGGA AGGAGAGAAA ACCACATCAC TGAAGTGGAA GTCAGATGAA GTCGATGAGC 4740 

AGGTTGCTTG CCAGGAGGTC AAAGTGAGTG TAGCAATTGA GGATTTAGAG CCTGAAAATG 4800 

GGATTTTGGA ACTTGAGACC AAAAGCAGTA AACTTGTCCA AAACATCATC CAGACAGCCG 4860 

TTGACCAGTT TGTACGTACA GAAGAAACAG CCACCGAAAT GTTGACGTCT GAGTTACAGA 4920 

CACAAGCTCA CGTGATAAAA GCTGACAGCC AGGACGCTGG ACAGGAAACG GAGAAAGAAG 4980 

GAGAGGAACC TCAGGCCTCT GCACAGGATG AAACACCAAT TACTTCAGCC AAAGAGGAGT 5040 

CAGAGTCAAC CGCAGTGGGA CAAGCACATT CTGATATTTC CAAAGACATG AGTGAAGCCT 5100 

CAGAAAAGAC CATGACTGTT GAGGTAGAAG GTTCCACTGT AAATGATCAG CAGCTGGAAG 5160 

AGGTCGTCCT CCCATCTGAG GAAGAGGGAG GTGGAGCTGG AACAAAGTCT GTGCCAGAAG 5220 

ATGATGGTCA TGCCTTGTTA GCAGAAAGAA TAGAGAAGTC ACTAGTTGAA CCGAAAGAAG 5280 

ATGAAAAAGG TGATGATGTT GATGACCCTG AAAACCAGAA CTCAGCCCTG GCTGATACTG 5340 

ATGCCTCAGG AGGCTTAACC AAAGAGTCCC CAGATACAAA TGGACCAAAA CAAAAAGAGA 5400 

AGGAGGATGC CCAGGAAGTA GAATTGCAGG AAGGAAAAGT GCACAGTGAA TCAGATAAAG 5460 

CGATCACACC CCAAGCACAG GAGGAGTTAC AGAAACAAGA GAGAGAATCT GCAAAGTCAG 5520 

AACTTACAGA ATCTTAAAAC ATCATGCAGT TAAACTCATT GTCTGTTTGG AAGACCAGAA 5580 

TGTGAAGACA AGTAGTAGAA GAAAATGAAT GCTGCTGCTG AGACTGAAGA CCAGTATTTC 5640 

AGAACTTTGA GAATTGGAGA GCAGGCACAT CAACTGATCT CATTTCTAGA GAGCCCCTGA 5700 

CAATCCTGAG GCTTCATCAG GAGCTAGAGC CATTTAACAT TTCCTCTTTC CAAGACCAAC 5760 

CTACAATTTT CCCTTGATAA CCATATAAAT TCTGATTTAA GGTCCTAAAT TCTTAACCTG 5820 

GAACTGGAGT TGGCAATACC TAGTTCTGCT TCTGAAACTG GAGTATCATT CTTTACATAT 5880 

TTATATGTAT GTTTTAAGTA GTCCTCCTGT ATCTATTGTA TATTTTTTTC TTAATGTTTA 5940 

AGGAAATGTG CAGGATACTA CATGCTTTTT GTATCACACA GTATATGATG GGGCATGTGC 6000 

CATAGTGCAG GCTTGGGGAG CTTTAAGCCT CAGTTATATA ACCCACAAAA AACAGAGCCT 6060 

CCTAGATGTA ACATTCCTGA TCAAGGTACA ATTCTTTAAA ATTCACTAAT GATTGAGGTC 6120 

CATATTTAGT GGTACTCTGA AATTGGTCAC TTTCCTATTA CACGGAGTGT GCCAAAACTA 6180 

AAAAGCATTT TGAAACATAC AGAATGTTCT ATTGTCATTG GGAAATTTTG CTTTCTAACC 6240 

CAGTGGAGGT TAGAAAGAAG TTATATTCTG GTAGCAAATT AACTTTACAT CCTTTTTCCT 6300 

ACTTGTTATG GTTGTTTGGA CCGATAAGTG TGCTTAATCC TGAGGCAAAG TAGTGAATAT 6360 

GTTTTATATG TTATGAAGAA AAGAATTGTT GTAAGTTTTT GATTCTACTC TTATATGCTG 6420 

GACTGCATTC ACACATGGCA TGAAATAAGT CAGGTTCTTT ACAAATGGTA TTTTGATAGA 6480 

TACTGGATTG TGTTTGTGCC ATATTTGTGC CATTCCTTTA AGAACAATGT TGCAACACAT 6540 

TCATTTGGAT AAGTTGTGAT TTGACGACTG ATTTAAATAA AATATTTGCT TCACTTAAAA 6600 
AAAAAAAA 

Seq ID NO: 44 Protein sequence: 
Protein Accession #: NP_005091 



1 11 21 31 41 51 

| | | | | | 

MGAGSSTEQR SPEQPPEGSS TPAEPEPSGG GPSAEAAPDT TADPAIAASD PATKLLQKNG 60 

QLSTINGVAE QDELSLQEGD LNGQKGALNG QGALNSQEEE EVIVTEVGQR DSEDVSERDS 120 

DKEMATKSAV VHDITDDGQE ENRNIEQIPS SESNLEELTQ PTESQANDIG FKKVFKFVGF 180 

KFTVKKDKTE KPDTVQLLTV KKDEGEGAAG AGDHQDPSLG AGEAASKESE PKQSTEKPEE 240 

TLKREQSHAE ISPPAESGQA VEECKEEGEE KQEKEPSKSA ESPTSPVTSE TGSTFKKFFT 300 

QGWAGWRKKT SFRKPKEDEV EASEKKKEQE PEKVDTEEDG KAEVASEKLT ASEQAHPQEP 360 

AESAHEPRLS AEYEKVELPS EEQVSGSQGP SEEKPAPLAT EVFDEKIEVH QEEVVAEVHV 420 

STVEERTEEQ KTEVEETAGS VPAEELVGMD AEPQEAEPAK ELVKLKETCV SGEDPTQGAD 480 

LSPDEKVLSK PPEGVVSEVE MLSSQERMKV QGSPLKKLFT STGLKKLSGK KQKGKRGGGD 540 

EESGEHTQVP ADSPDSQEEQ KGESSASSPE EPEEITCLEK GLAEVQQDGE AEEGATSDGE 600 

KKREGVTPWA SFKKMVTPKK RVRRPSESDK EDELDKVKSA TLSSTESTAS EMQEEMKGSV 660 

EEPKPEEPKR KVDTSVSWEA LICVGSSKKR ARRRSSSDEE GGPKAMGGDH QKADEAGKDK 720 

ETGTDGILAG SQEHDPGQGS SSPEQAGSPT EGEGVSTWES FKRLVTPRKK SKSKLEEKSE 780 

DSIAGSGVEH STPDTEPGKE ESWVSIKKFI PGRRKKRPDG KQEQAPVEDA GPTGANEDDS 840 

DVPAVVPLSE YDAVEREKME AQQAQKGAEQ PEQKAATEVS KELSESQVHM MAAAVADGTR 900 

AATIIEERSP SWISASVTEP LEQVEAEAAL LTEEVLEREV IAEEEPPTVT EPLPENREAR 960 

GDTVVSEAEL TPEAVTAAET AGPLGSEEGT EASAAEETTE MVSAVSQLTD SPDTTEEATP 1020 

VQEVEGGVPD IEEQERRTQE VLQAVAEKVK EESQLPGTGG PEDVLQPVQR AEAERPEEQA 1080 

EASGLKKETD VVLKVDAQEA KTEPFTQGKV VGQTTPESFE KAPQVTESIE SSELVTTCQA 1140 

ETLAGVKSQE MVMEQAIPPD SVETPTDSET DGSTPVADFD APGTTQKDEI VEIHEENEVA 1200 

SGTQSGGTEA EAVPAQKERP PAPSSFVFQE ETKEQSKMED TLEHTDKEVS VETVSILSKT 1260 

EGTQEADQYA DEKTKDVPFF EGLEGSIDTG ITVSREKVTE VALKGEGTEE AECKKDDALE 1320 

LQSHAKSPPS PVEREMVVQV EREKTEAEPT HVNEEKLEHE TAVTVSEEVS KQLLQTVNVP 1380 

IIDGAKEVSS LEGSPPPCLG QEEAVCTKIQ VQSSEASFTL TAAAEEEKVL GETANILETG 1440 

ETLEPAGAHL VLEEKSSEKN EDFAAHPGED AVPTGPDCQA KSTPVIVSAT TKKGLSSDLE 1500 

GEKTTSLKWK SDEVDEQVAC QEVKVSVAIE DLEPENGILE LETKSSKLVQ NIIQTAVDQF 1560 

VRTEETATEM LTSELQTQAH VIKADSQDAG QETEKEGEEP QASAQDETPI TSAKEESEST 1620 

AVGQAHSDIS KDMSEASEKT MTVEVEGSTV NDQQLEEVVL PSEEEGGGAG TKSVPEDDGH 1680 

ALLAERIEKS LVEPKEDEKG DDVDDPENQN SALADTDASG GLTKESPDTN GPKQKEKEDA 1740 
QEVELQEGKV HSESDKAITP QAQEELQKQE RESAKSELTE S 


Seq ID NO: 45 Nucleotide sequence: 
Nucleic Acid Accession #: NM_001290 

Coding sequence: 110..1231 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GTGAGCGTGT GTGCGTGCGT CTACTTTGTA CTGGGAAGAA CACAGCCCAT GTGCTCTGCA 60 

TGGACGTTAC TGATACTCTG TTTAGCTTGA TTTTCGAAAA GCAGGCAAGA TGTCCAGCAC 120 

ACCACATGAC CCCTTCTATT CTTCTCCTTT CGGCCCATTT TATAGGAGGC ATACACCATA 180 

CATGGTACAG CCAGAGTACC GAATCTATGA GATGAACAAG AGACTGCAGT CTCGCACAGA 240 

GGATAGTGAC AACCTCTGGT GGGACGCCTT TGCCACTGAA TTTTTTGAAG ATGACGCCAC 300 

ATTAACCCTT TCATTTTGTT TGGAAGATGG ACCAAAGCGA TACACTATCG GCAGGACCCT 360 

CATCCCCCGT TACTTTAGCA CTGTGTTTGA AGGAGGGGTG ACCGACCTGT ATTACATTCT 420 

CAAACACTCG AAAGAGTCAT ACCACAACTC ATCCATCACG GTGGACTGCG ACCAGTGTAC 480 

CATGGTCACC CAGCACGGGA AGCCCATGTT TACCAAGGTA TGTACAGAAG GCAGACTGAT 540 

CTTGGAGTTC ACCTTTGATG ATCTCATGAG AATCAAAACA TGGCACTTTA CCATTAGACA 600 

ATACCGAGAG TTAGTCCCGA GAAGCATCCT AGCCATGCAT GCACAAGATC CTCAGGTCCT 660 

GGATCAGCTG TCCAAAAACA TCACCAGGAT GGGGCTAACA AACTTCACCC TCAACTACCT 720 

CAGGTTGTGT GTAATATTGG AGCCAATGCA GGAACTGATG TCGAGACATA AAACTTACAA 780 

CCTCAGTCCC CGAGACTGCC TGAAGACCTG CTTGTTTCAG AAGTGGCAGA GGATGGTGGC 840 

TCCGCCAGCA GAACCCACAA GGCAACCAAC AACCAAACGG AGAAAAAGGA AAAATTCCAC 900 

CAGCAGCACT TCCAACAGCA GCGCTGGGAA CAATGCAAAC AGCACTGGCA GCAAGAAGAA 960 

GACCACAGCT GCAAACCTGA GTCTGTCCAG TCAGGTACCT GATGTGATGG TGGTAGGAGA 1020 

GCCAACTCTG ATGGGAGGTG AGTTTGGGGA CGAGGACGAA AGGCTAATCA CTAGATTAGA 1080 

AAACACGCAA TATGATGCGG CCAACGGCAT GGACGACGAG GAGGACTTCA ACAATTCACC 1140 

CGCGCTGGGG AACAACAGCC CGTGGAACAG TAAACCTCCC GCCACTCAAG AGACCAAATC 1200 

AGAAAACCCC CCACCCCAGG CTTCCCAATA AGATGATCGG CACCAGAATC CACTGTCAAT 1260 

AGGCCCGTGG GTGATCATTA CAATTGCAAA TCTTTACTTA CAGGAGAGGA AACAGAAGAG 1320 

ATAAAAACTT TTCCATGCAA ATATCTATTT CTAAACCACA ATGATCTGAT TTTCTTTCTT 1380 

CTTTCTTTTT TTCTAATTGA GAGGATTATT CCCAGTAAGC TTCCATGACC CTTTCTTGGA 1440 

GGCCTTCACA GGTAATACAG ATACTGGCAC TGATTGTAAT TAAAATGAGA GAAAACTCTA 1500 

GCGCATCTTC TGGCACGGTT TTAACAACGT GTTTGTGTTG AATTTCCTTT TTATGCATCA 1560 

AACGAAGGCC ATATTGTCCA TAAATGCTCA GTGCTCAGGA TCTCATTAAT ATGCCGAACC 1620 

TAACTACAGA TGACTTTTTA ATATTGTAAA ATATTTTCTG CTTTTTGACT TGCATCTGAG 1680 

AGTTTCTTGT TTCAGTAAAA AAAGAAAAGA CAAAAAAATC AGCTTTGGAA AGTAATTTAA 1740 

ATGTACCTTA TTTTTTTTTT CTTTATGTTT TCTTTCATTG GGCAACAGCT AAGAGGGCCC 1800 

AGCAAGGTAA TTTATGGTTG AGCTGATGTC AATTGGTTCT TGTCTTGAGT CGACTCAATT 1860 

TAGCCCAAGT GCTGAAACAA GAAATGTCAT TTTTTTCATC AAAGACACCA GGGCAGATTT 1920 

TTAAGTAAAG AAAGACAATT GGACCCTTAA GAATTTATGC ATTTGTAAAG TTGCTGTTGA 1980 

TCCAAATATT TTCAAGCCAT GTAATCCATT GGTTTTGTGG GCAGTTTAAT AAACCTGAAC 2040 

CTTTGTGTGT TTTCTAATTG TACCTGAGTT GACCATCCTT TCTTTTTATA GTATATTTCT 2100 

TGTATGATAT TTTGTAAAGC TCTCACCTGG TTCTTTTATG GGGACTTTTC GTTTTTGGGC 2160 

AACTCCAGTG TATTTATGTG AAACTTTATA AGAGAATTAA TTTTTCCATT TGCATATTAA 2220 

TATGTTCCTC CACACATGTA AAGGCACAGT GGCTCCGTGT GTTAAAAAAC AGCTGTATTT 2280 
TATGTATGCT TTACTGATAA GTGTGCCAAT AATAAACTGT GTTAATGACC 



Seq ID NO: 46 Protein sequence: 
Protein Accession #: NP_001281 



1 11 21 31 41 51 

| | | | | | 

MSSTPHDPFY SSPFGPFYRR HTPYMVQPEY RIYEMNKRLQ SRTEDSDNLW WDAFATEFFE 60 

DDATLTLSFC LEDGPKRYTI GRTLIPRYFS TVFEGGVTDL YYILKHSKES YHNSSITVDC 120 

DQCTMVTQHG KPMFTKVCTE GRLILEFTFD DLMRIKTWHF TIRQYRELVP RSILAMHAQD 180 

PQVLDQLSKN ITRMGLTNFT LNYLRLCVIL EPMQELMSRH KTYNLSPRDC LKTCLFQKWQ 240 

RMVAPPAEPT RQPTTKRRKR KNSTSSTSNS SAGNNANSTG SKKKTTAANL SLSSQVPDVM 300 

VVGEPTLMGG EFGDEDERLI TRLENTQYDA ANGMDDEEDF NNSPALGNNS PWNSKPPATQ 360 
ETKSENPPPQ ASQ 

Seq ID NO: 47 Nucleotide sequence: 
14126 

to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 60 

AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC 120 

ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 180 

AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 240 

AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 300 

AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360 

AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420 

TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT 480 

GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 540 

ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT 600 
GCTTCAAATA AAGTTTTGTC TT 



| | | | | | 

MPALHIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED 
KNPFKEKGSC VIS 



Seq ID NO: 49 Nucleotide sequence: 
Nucleic Acid Accession #: XM_051896 

Coding sequence: 139..2388 (underlined sequences correspond to start and stop codons) 

GTTTTAAAGA CGCTAGAGTG CCAAAGAAGA CTTTGAAGTG TGAAAACATT TCCTGTAATT 60 

GAAACCAAAA TGTCATTTAT AGATCCTTAC CAGCACATTA TAGTGGAGCA CCAGTATTCC 120 

CACAAGTTTA CGGTAGTGGT GTTACGTGCC ACCAAAGTGA CAAAGGGGGC CTTTGGTGAC 180 

ATGCTTGATA CTCCAGATCC CTATGTGGAA CTTTTTATCT CTACAACCCC TGACAGCAGG 240 

AAGAGAACAA GACATTTCAA TAATGACATA AACCCTGTGT GGAATGAGAC CTTTGAATTT 300 

ATTTTGGATC CTAATCAGGA AAATGTTTTG GAGATTACGT TAATGGATGC CAATTATGTC 360 

ATGGATGAAA CTCTAGGGAC AGCAACATTT ACTGTATCTT CTATGAAGGT GGGAGAAAAG 420 

AAAGAAGTTC CTTTTATTTT CAACCAAGTC ACTGAAATGG TTCTAGAAAT GTCTCTTGAA 480 

GTTTGCTCAT GCCCAGACCT ACGATTTAGT ATGGCTCTGT GTGATCAGGA GAAGACTTTC 540 

AGACAACAGA GAAAAGAACA CATAAGGGAG AGCATGAAGA AACTCTTGGG TCCAAAGAAT 600 

AGTGAAGGAT TGCATTCTGC ACGTGATGTG CCTGTGGTAG CCATATTGGG TTCAGGTGGG 660 

GGTTTCCGAG CCATGGTGGG ATTCTCTGGT GTGATGAAGG CATTATACGA ATCAGGAATT 720 

CTGGATTGTG CTACCTACGT TGCTGGTCTT TCTGGCTCCA CCTGGTATAT GTCAACCTTG 780 

TATTCTCACC CTGATTTTCC AGAGAAAGGG CCAGAGGAGA TTAATGAAGA ACTAATGAAA 840 

AATGTTAGCC ACAATCCCCT TTTACTTCTC ACACCACAGA AAGTTAAAAG ATATGTTGAG 900 

TCTTTATGGA AGAAGAAAAG CTCTGGACAA CCTGTCACCT TTACTGATAT CTTTGGGATG 960 

TTAATAGGAG AAACACTAAT TCATAATAGA ATGAATACTA CTCTGAGCAG TTTGAAGGAA 1020 

AAAGTTAATA CTGCACAATG CCCTTTACCT CTTTTCACCT GTCTTCATGT CAAACCTGAC 1080 

GTTTCAGAGC TGATGTTTGC AGATTGGGTT GAATTTAGTC CATACGAAAT TGGCATGGCT 1140 

AAATATGGTA CTTTTATGGC TCCCGACTTA TTTGGAAGCA AATTTTTTAT GGGAACAGTC 1200 

GTTAAGAAGT ATGAAGAAAA CCCCTTGCAT TTCTTAATGG GTGTCTGGGG CAGTGCCTTT 1260 

TCCATATTGT TCAACAGAGT TTTGGGCGTT TCTGGTTCAC AAAGCAGAGG CTCCACAATG 1320 

GAGGAAGAAT TAGAAAATAT TACCACAAAG CATATTGTGA GTAATGATAG CTCGGACAGT 1380 

GATGATGAAT CACACGAACC CAAAGGCACT GAAAATGAAG ATGCTGGAAG TGACTATCAA 1440 

AGTGATAATC AAGCAAGTTG GATTCATCGT ATGATAATGG CCTTGGTGAG TGATTCAGCT 1500 

TTATTCAATA CCAGAGAAGG ACGTGCTGGG AAGGTACACA ACTTCATGCT GGGCTTGAAT 1560 

CTCAATACAT CTTATCCACT GTCTCCTTTG AGTGACTTTG CCACACAGGA CTCCTTTGAT 1620 

GATGATGAAC TGGATGCAGC TGTAGCAGAT CCTGATGAAT TTGAGCGAAT ATATGAGCCT 1680 

CTGGATGTCA AAAGTAAAAA GATTCATGTA GTGGACAGTG GGCTCACATT TAACCTGCCG 1740 

TATCCCTTGA TACTGAGACC TCAGAGAGGG GTTGATCTCA TAATCTCCTT TGACTTTTCT 1800 

GCAAGGCCAA GTGACTCTAG TCCTCCGTTC AAGGAACTTC TACTTGCAGA AAAGTGGGCT 1860 

AAAATGAACA AGCTCCCCTT TCCAAAGATT GATCCTTATG TGTTTGATCG GGAAGGGCTG 1920 

AAGGAGTGCT ATGTCTTTAA ACCCAAGAAT CCTGATATGG AGAAAGATTG CCCAACCATC 1980 

ATCCACTTTG TTCTGGCCAA CATCAACTTC AGAAAGTACA GGGCTCCAGG TGTTCCAAGG 2040 

GAAACTGAGG AAGAGAAAGA AATCGCTGAC TTTGATATTT TTGATGACCC AGAATCACCA 2100 

TTTTCAACCT TCAATTTTCA ATATCCAAAT CAAGCATTCA AAAGACTACA TGATCTTATG 2160 

CACTTCAATA CTCTGAACAA CATTGATGTG ATAAAAGAAG CCATGGTTGA AAGCATTGAA 2220 

TATAGAAGAC AGAATCCATC TCGTTGCTCT GTTTCCCTTA GTAATGTTGA GGCAAGAAGA 2280 

TTTTTCAACA AGGAGTTTCT AAGTAAACCC AAAGCATAGT TCATGTACTG GAAATGGCAG 2340 

CAGTTTCTGA TGCTGAGGCA GTXTGCAATC CCATGACAAC TGGATTTAAA AGTACAGTAC 2400 

AGATAGTCGT ACTGATCATG AGAGACTGGC TGATACTCAA AGTTGCAGTT ACTTAGCTGC 2460 

ATGAGAATAA TACTATTATA AGTTAGGTTG ACAAATGATG TTGATTATGT AAGGATATAC 2520 

TTAGCTACAT TTTCAGTCAG TATGAACTTC CTGATACAAA TGTAGGGATA TATACTGTAT 2580 

TTTTAAACAT TTCTCACCAA CTTTCTTATG TGTGTTCTTT TTAAAAATTT TTTTTCTTTT 2640 

AAAATATTTA ACAGTTCAAT CTCAATAAGA CCTCGCATTA TGTATGAATG TTATTCACTG 2700 

ACTAGATTTA TTCATACCAT GAGACAACAC TATTTTTATT TATATATGCA TATATATACA 2760 
TACATGAAAT AAATACATCA ATATAAAAAT 

Seq ID NO: 50 Protein sequence: 
Protein Accession #: XP_051896 

1 11 21 31 41 51 

| | | | | | 

MSFIDPYQHI IVEHQYSHKF TVVVLRATKV TKGAFGDMLD TPDPYVELFI STTPDSRKRT 60 

RHFNNDINPV WNETFEFILD PNQKNVLEIT LMDANYVMDE TLGTATFTVS SMKVGEKKEV 120 

PFIFNQVTEM VLEMSLEVCS CPDLRFSMAL CDQEKTFRQQ RKEHIRESMK KLLGPKNSEG 180 

LHSARDVPVV AILGSGGGFR AMVGFSGVMK ALYESGILDC ATYVAGLSGS TWYMSTLYSH 240 

PDFPEKGPEE INEELMKNVS HNPLLLLTPQ KVKRYVESLW KKKSSGQPVT FTDIFGMLIG 300 

ETLIHNRMNT TLSSLKEKVN TAQCPLPLFT CLHVKPDVSE LMFADWVEFS PYEIGMAKYG 360 

TFMAPDLFGS KFFMGTVVKK YEENPLHFLM GVWGSAFSIL FNRVLGVSGS QSRGSTMEEE 420 

LENITTKHIV SNDSSDSDDE SHEPKGTENE DAGSDYQSDN QASWIHRMIM ALVSDSALFN 480 

TREGRAGKVH NFMLGLNLNT SYPLSPLSDF ATQDSFDDDE LDAAVADPDE FERIYEPLDV 540 

KSKKIHVVDS GLTFNLPYPL ILRPQRGVDL IISFDFSARP SDSSPPFKEL LLAEKWAKMN 600 

KLPFPKIDPY VFDREGLKEC YVFKPKNPDM EKDCPTIIKF VLANINFRKY KAPGVPRETE 660 

EEKEIADFDI FDDPESPFST FNFQYPNQAF KRLHDLMHFN TLNNIDVIKE AMVESIEYRR 720 

QNPSRCSVSL SNVEARRFFN KEFLSKPKA 



Seq ID NO: 51 Nucleotide sequence: 
Nucleic Acid Accession #: NM_006528 
Coding sequence: 57..764 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GCCGCCAGCG GCTTTCTCGG ACGCCTTGCC CAGCGGGCCG CCCGACCCCC TGCACCATGG 60 

ACCCCGCTCG CCCCCTGGGG CTGTCGATTC TGCTGCTTTT CCTGACGGAG GCTGCACTGG 120 

GCGATGCTGC TCAGGAGCCA ACAGGAAATA ACGCGGAGAT CTGTCTCCTG CCCCTAGACT 180 

ACGGACCCTG CCGGGCCCTA CTTCTCCGTT ACTACTACGA CAGGTACACG CAGAGCTGCC 240 

GCCAGTTCCT GTACGGGGGC TGCGAGGGCA ACGCCAACAA TTTCTACACC TGGGAGGCTT 300 

GCGACGATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAAGT TTGCCGGCTG CAAGTGAGTG 360 

TGGACGACCA GTGTGAGGGG TCCACAGAAA AGTATTTCTT TAATCTAAGT TCCATGACAT 420 

GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACCGGAT TGAGAACAGG TTTCCAGATG 480 

AAGCTACTTG TATGGGCTTC TGCGCACCAA AGAAAATTCC ATCATTTTGC TACAGTCCAA 540 

AAGATGAGGG ACTGTGCTCT GCCAATGTGA CTCGCTATTA TTTTAATCCA AGATACAGAA 600 

CCTGTGATGC TTTCACCTAT ACTGGCTGTG GAGGGAATGA CAATAACTTT GTTAGCAGGG 660 

AGGATTGCAA ACGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATG CCAAAGCTTC 720 

GCTTTGCCAG TAGAATCCGG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 

ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTG? ATCTGAAGAA TAATATGACA 840 

GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATTAATAC AAGTCACTTT 900 

TTCAAAAATT TGGATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT 960 

TTTAATTTAT GGTTCAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC 1020 

AAATATGACT CACTCATTTC TTGGGGTCGT ATTCCTGATT TCAGAAGAGG ATCATAACTG 1080 

AAACAACATA AGACAATATA ATCATGTGCT TTTAACATAT TTGAGAATAA AAAGGACTAG 1140 
CC 

Seq ID NO: 52 Protein sequence: 
Protein Accession #: NP_006519 

1 11 21 31 41 51 

| | | | | | 

MDPARPLGLS ILLLFLTEAA LGDAAQEPTG NNAEICLLPL DYGPCRALLL RYYYDRYTQS 60 

CRQFLYGGCE GNANNFYTWE ACDDACWRIE KVPKVCRLQV SVDDQCEGST EKYFFNLSSM 120 

TCEKFFSGGC HRNRIENRFP DEATCMGFCA PKKIPSFCYS PKDEGLCSAN VTRYYFNPRY 180 
RTCDAFTYTG CGGNDNNFVS REDCKRACAK ALKKKKKMPK LRFASRIRKI RKKQF 



Seq ID NO: 53 Nucleotide sequence: 
Nucleic Acid Accession #: AA478778 
Coding sequence: no ORF found 



1 11 21 31 41 51 

| | | | | | 

TATTTTTGTA CGTAAAATGA TTCTATTATG ACTGCCTTTG CATGTAGTAA TATGACAAAG 60 

TGATCCTTCA TTATCACGGT ACACTATTGT TTACTTTTCA TCTGTAAATG TTTTATTGTT 120 

ACTTTTTTAA AATGAATTTT TTTAAAACAA TCTAGCCATC ATCAAGGTGC TATAAGAGTT 180 

GTATAAAAGA TATTTTTGGC ATTTCTAGGC AAGTATCAGC CAATAAGTAT GTTAGTGATA 240 

TCACAGATTG TACCAACTAT TAACTATGTT AAATAAGTAT TCAGTTTCAT GTGATCTCTG 300 

GGAAAAAAAT ATGCTGCCTT GGTGCTAATA TTGTATGTAT TTAAATGATC ATCTGACTCA 360 

GAAATATAAA CACTTTTAAT GAAAGGGAGG AACGGAAGGA CAATTTCCAG TGCACAGAAT 420 

CACTTGGATG AAATAAGACC AGCTCTTTAC CCTTATTTTT GGATATGCCT TTTTTGGAAG 480 

AGACTTAGAC TTTATCCTTA TTGTTGTTAG TGTTGTTAAT ATTCGTTGCT TCAGCCCACG 540 

GTGCCTTGGT CTCTCCACAA TCAAATGGAG GATCCCCCAA GCAGCTTCAT TACAGAGTGA 600 

TATTGGGAAA GTGAGATCCT CTCACCATTT TGCCAAGATA CTCTAAAATG ACATCCAAGT 660 

TTACCAGTAG AAAGACACAG GATGCACAGA ATGGGCATGA CCTTCAGCTC ACGAGCACAC 720 

CTGGAGAAAT TCAGAACCAG GTTCTGAATC ATCACGATTG CCTTTTGCAT GAAAACATCG 780 

GCTGGTGATG TGACTTCTCT TCAGGCCATG AGCCTAACAY CCTGCCGGTT TTCATGCCCG 840 

CTGCAGTAAT GGACGTTTGT GTGAAGAAAT GAACTGTGGA GTACAAAATG CTTTGAGTCT 900 

TTCCGATTGC TCATTAATTC ACTTTTTTGT TACTTCTTTC CAAAATGGAA GTGCTGAAGC 960 

CATGGTCTTT CTGCCCCTCC AAGCTGATGA AGGGAAGCCT TTGCCAATGG CCCATGGAAG 1020 

ACACTTGGTT TGAGAAACCC TGCCCACTTC CAAAGACCAA AGAGATTAGG AAAAGCCTGG 1080 

CAGTATTCTC CAACTCCAAA CAAGCTCTAG AGTGCTCCAG GAAAAGTTAT ATTCAGTATA 1140 

TGAATAAGTG TTATTCTCCA TTATTAATGT GTTCTGAAAA TATATTATGA ATAAATACAT 1200 
CACCACACCC AAAAAAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 54 Nucleotide sequence: 
Nucleic Acid Accession #: NM_020663 

Coding sequence: 1..645 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGAACTGCA AAGAGGGAAC TGACAGCAGC TGCGGCTGCA GGGGCAACGA CGAGAAGAAG 60 

ATGTTGAAGT GTGTGGTGGT GGGGGACGGT GCCGTGGGGA AAACCTGCCT GCTGATGAGC 120 

TACGCCAACG ACGCCTTCCC AGAGGAATAC GTGCCCACTG TGTTTGACCA CTATGCAGTT 180 

75 ACTGTGACTG TGGGAGGCAA GCAACACTTG CTCGGACTGT ATGACACCGC GGGACAGGAG 240 

GACTACAACC AGCTGAGGCC ACTCTCCTAC CCCAACACGG ATGTGTTTTT GATCTGCTTC 300 
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TCTGTCGTAA ACCCTGCCTC TTACCACAAT GTCCAGGAGG AATGGGTCCC CGAGCTCAAG 360 

GACTGCATGC CTCACGTGCC TTATGTCCTC ATAGGGACCC AGATTGATCT CCGTGATGAC 420 

CCAAAAACCT TGGCCCGTTT GCTGTATATG AAAGAGAAAC CTCTCACTTA CGAGCATGGT 480

GTGAAGCTCG CAAAAGCGAT CGGAGCACAG TGCTACTTGG AATGTTCAGC XCTGACTCAG 540 

AAAGGTCTCA AAGCGGTTTT TGATGAAGCA ATCCTCACCA TTTTCCACCC CAAGAAAAAG 600
AAGAAACGCT GTTCTGAGGG TCACAGCTGC TGTTCAATTA TCTGA 



Seq ID NO: 55 Protein sequence: 
Protein Accession #: NP_065714

1 11 21 31 41 51 

I I I I I I 

MNCKEGTDSS CGCRGNDEKK MLKCVVVGDG AVGKTCLLMS YANDAFPEEY VPTVFDHYAV 60
TVTVGGKQHL LGLYDTAGQE DYNQLRPLSY PNTDVFLICF SVVNPASYHN VQEEWVPELK 120
DCMPHVPYVL IGTQIDLRDD PKTLARLLYM KEKPLTYEHG VKLAKAIGAQ CYLECSALTQ 180
KGLKAVFDEA ILTIFHPKKK KKRCSEGHSC CSII
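
By way of illustration only, the relationship between a "Coding sequence" coordinate range and the length of the corresponding protein sequence can be sketched as follows. The sketch assumes 1-based inclusive coordinates and that the stated range includes the stop codon; the function name `expected_protein_length` is hypothetical and is not part of the listing.

```python
def expected_protein_length(cds_start, cds_end):
    """Return the number of residues encoded by a coding sequence.

    Assumes 1-based inclusive coordinates and that the range ends with
    a stop codon (an assumption about how the listing states ranges).
    """
    span = cds_end - cds_start + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("coding span must be a whole number of codons")
    return span // 3 - 1  # the final codon is the stop codon


# A coding sequence running from position 1 to position 645:
print(expected_protein_length(1, 645))  # prints 214
```

Under these assumptions, a range of 1 to 645 corresponds to 215 codons, that is, 214 residues followed by a stop codon.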



Seq ID NO: 56 Nucleotide sequence:

Nucleic Acid Accession #: fgenesh prediction 

Coding sequence: 1-546 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I 

ATGGCCTTGG GCAGCTCCGC CCCTGTGGCT TTGCAGGGTA ATGCCCACTT CCCTGCTGCT 60 

TTCATGGCTG GCATTAAGTG TCTGTGGCTT TTCCAGGTAG TCCCCCTGGG GCTCCCCGAG 120 

TTGGTGCAAA GGCTCCTGGG TGGAGCTCGA ACTGAAACTC GCTTTGTGCC CGCAGCCCTG 180

CAGCTCGCCG GTGCCCTCGA CCTGCCCGCT GGGTCCTGTG CCTTTGAAGA GAGCACTTGC 240

GGCTTTGACT CCGTGTTGGC CTCTCTGCCG TGGATTTTAA ATGAGGAAGG CCAGCAACCT 300

TTCTGGTCCT CAGGAGACAT GTCTGACTGG GACTACTGGG TTGGCTGGCG GAAGTTAATT 360

CATTCTCCTC TGAGCACTCC AGGGTGGAGC AGGCAGGTTA GGCTCCAGTT GTTCCAGCTT 420

CAGTTTGTCA AAGGCCAGAA CTTGGACGTA ACAGTGTACT GCAGGCTCCA GGGCAGTGAG 480 

AAACCCTTTG AAACTGGTTC CATGGTTCCA TTCACCTTCA TGTACTGGAT CCACCATGGA 540



Seq ID NO: 57 Protein sequence:
Protein Accession #: fgenesh prediction



I I I I I I 

MALGSSAPVA LQGNAHFPAA FMAGIKCLWL FQVVPLGLPE LVQRLLGGAR TETRFVPAAL

QLAGALDLPA GSCAFEESTC GFDSVLASLP WILNEEGQQP FWSSGDMSDW DYWVGWRKLI

HSPLSTPGWS RQVRLQLFQL QFVKGQNLDV TVYCRLQGSE KPFETGSMVP FTFMYWIHHG 



Seq ID NO: 58 Nucleotide sequence:
Nucleic Acid Accession #: XM_050478
Coding sequence: 27..4508 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCGGCGGCGC CTGAGCCCAG CCGAGGATGG AGAACCGGCC TGGGTCCTTC CAGTACGTCC 60 

CTGTGCAGCT GCAAGGGGGG GCACCCTGGG GCTTCACCCT TAAGGGGGGT CTGGAACACT 120

GTGAGCCGCT CACAGTGTCT AAGATTGAAG ATGGAGGCAA GGCAGCTTTG TCCCAGAAGA 180 

TGAGGACTGG TGATGAGCTG GTGAATATCA ATGGCACTCC ATTATATGGC TCCCGCCAAG 240 

AGGCCCTCAT TCTCATCAAA GGCTCCTTCC GGATTCTCAA GCTGATTGTC AGGAGGAGGA 300 

ACGCCCCTGT CAGTAGGCCG CACTCATGGC ATGTGGCCAA GCTGCTGGAG GGATGCCCTG 360 

AAGCAGCCAC CACCATGCAT TTCCCTTCTG AAGCCTTCAG CTTGTCCTGG CATTCTGGCT 420

GCAACACAAG TGACGTGTGT GTGCAGTGGT GTCCACTCTC CCGGCATTGC AGCACCGAGA 480 

AAAGCAGCTC CATTGGCAGC ATGGAGAGCC TGGAGCAACC AGGCCAAGCC ACCTATGAGA 540 

GCCATCTGTT GCCTATTGAC CAGAACATGT ACCCTAACCA GCGTGACTCA GCCTACAGCT 600 

CCTTCTCGGC CAGCTCAAAT GCTTCTGACT GTGCCCTTTC CCTCAGGCCA GAGGAGCCAG 660 

CCTCTACAGA CTGCATCATG CAAGGCCCAG GGCCAACTAA GGCCCCCAGT GGCCGGCCTA 720

ATGTGGCTGA GACCTCAGGA GGTAGTCGGC GCACCAATGG GGGCCACCTG ACCCCCAGCT 780

CTCAGATGTC ATCCCGTCCA CAGGAGGGAT ACCAGTCAGG GCCCGCCAAA GCAGTCAGGG 840

GCCCACCACA ACCTCCAGTG AGGCGGGACA GCCTTCAGGC CTCCAGAGCC CAACTCCTCA 900 

ATGGAGAGCA GCGCAGGGCA TCTGAGCCTG TGGTCCCCTT GCCACAGAAG GAGAAACTGA 960 

GCTTAGAGCC TGTGCTACCC GCAAGGAACC CTAATAGGTT CTGTTGCCTC AGTGGGCATG 1020 

ACCAAGTGAC AAGTGAGGGC CATCAGAACT GTGAGTTCAG TCAGCCTCCT GAATCCAGCC 1080 

AACAGGGCTC TGAGCATCTA CTGATGCAGG CCTCAACCAA AGCTGTTGGA TCCCCAAAAG 1140

CCTGTGACAG AGCTTCCAGC GTGGATTCCA ACCCACTCAA TGAGGCTTCT GCAGAGCTAG 1200 

CTAAGGCTTC TTTTGGCAGA CCTCCACATC TCATAGGACC CACAGGGCAT CGCCATAGTG 1260 

CCCCTGAACA GCTGGTGGCA TCCCACCTGC AGCATGTGCA CCTTGATACC AGGGGCAGCA 1320 

AAGGGATGGA GCTCCCACCC GTACAGGATG GGCACCAGTG GACTCTGTCC CCTTTGCACA 1380 
GCAGCCACAA AGGGAAGAAA AGTCCATGCC CCCCTACAGG AGGAACCCAT GACCAGTCCA 1440
GCAAAGAAAG AAAGACCAGA CAAGTGGATG ACAGGTCTTT AGTTTTGGGA CACCAGAGCC 1500
AAAGCAGTCC CCCACATGGA GAGGCTGATG GACACCCCTC AGAAAAAGGT TTCCTGGACC 1560
CAAACAGAAC AAGCAGAGCA GCCAGTGAAT TGGCCAACCA GCAACCCTCT GCCTCTGGCT 1620
CCCTTGTTCA ACAAGCCACG GACTGTTCTT CAACCACTAA AGCAGCTAGT GGCACAGAGG 1680
CAGGTGAAGA AGGGGACAGC GAGCCCAAGG AGTGCAGCCG GATGGGTGGT AGGCGAAGTG 1740
GAGGGACCCG GGGCCGCTCG ATCCAAAACC GGCGGAAGAG TGAGCGTTTT GCTACCAATC 1800
TGCGTAATGA AATTCAGAGG AGGAAGGCCC AGCTCCAGAA AAGCAAGGGT CCCTTGTCAC 1860
AGCTGTGTGA CACTAAGGAG CCAGTGGAAG AGACCCAGGA GCCCCCAGAA AGTCCTCCAC 1920
TCACTGCCTC TAACACATCT CTTCTATCTT CATGTAAAAA ACCTCCCAGC CCCAGAGACA 1980
AGCTCTTCAA CAAAAGCATG ATGCTCAGAG CTAGGTCTTC CGAGTGCCTC AGCCAAGCCC 2040
CTGAGAGCCA TGAATCTAGG ACAGGCTTAG AGGGACGAAT AAGCCCTGGC CAGAGGCCTG 2100
GCCAGTCCTC TTTGGGCCTG AACACCTGGT GGAAAGCACC TGACCCATCC TCCTCAGACC 2160
CTGAGAAAGC ACATGCTCAC TGTGGAGTCC GTGGAGGTCA TTGGAGATGG TCTCCAGAGC 2220
ATAATTCACA GCCACTTGTG GCAGCAGCCA TGGAAGGCCC TTCCAACCCA GGTGACAACA 2280
AGGAATTGAA GGCTTCTACT GCTCAAGCTG GGGAGGATGC CATCCTCTTG CCTTTTGCAG 2340
ACAGAAGAAA GTTCTTTGAA GAGAGTAGCA AATCCTTATC TACATCTCAT TTGCCAGGTT 2400
TAACCACTCA TAGCAACAAG ACTTTTACCC AGAGACCAAA ACCTATAGAC CAAAACTTCC 2460
AGCCAATGAG CTCCAGCTGT AGGGAATTGA GGCGCCATCC CATGGACCAA TCATATCATT 2520
CCGCAGACCA ACCATATCAT GCCACAGACC AATCATATCA TTCCATGTCA CCCCTTCAGT 2580
CAGAAACTCC CACTTACTCA GAATGTTTTG CAAGCAAAGG TCTAGAAAAT TCCATGTGTT 2640
GTAAGCCACT ACACTGTGGT GATTTTGATT ACCACAGGAC CTGCTCTTAC TCCTGCAGTG 2700
TTCAAGGAGC TCTAGTCCAT GATCCTTGCA TTTATTGTTC TGGGGAAATC TGCCCTGCCT 2760
TGCTAAAGAG AAATATGATG CCAAATTGCT ACAACTGCCG GTGCCACCAC CACCAATGCA 2820
TTCGGTGTTC AGTTTGCTAT CATAATCCTC AGCACAGTGC CCTCGAGGAC AGCAGCTTGG 2880
CACCTGGCAA CACTTGGAAA CCCAGGAAGC TGACAGTGCA GGAATTTCCT GGGGACAAAT 2940
GGAATCCAAT AACAGGAAAC AGGAAGACCA GCCAGTCAGG GAGGGAAATG GCTCATTCCA 3000
AGACTAGCTT TTCATGGGCA ACCCCTTTCC ATCCTTGCCT TGAGAACCCA GCACTGGACT 3060
TGTCAAGCTA CCGAGCAATT TCTTCTCTTG ACCTCCTTGG AGACTTCAAA CATGCTTTGA 3120
AAAAATCAGA GGAAACTTCA GTTTATGAGG AGGGGAGCTC CCTTGCCTCC ATGCCCCACC 3180
CACTGCGCAG CCGTGCCTTC TCAGAGAGTC ACATCAGCTT GGCGCCCCAA AGCACCCGGG 3240
CCTGGGGGCA GCATAGGAGG GAGCTCTTTA GCAAAGGTGA TGAGACCCAG TCGGATCTTC 3300
TCGGAGCCAG GAAGAAGGCC TTTCCTCCTC CTCGCCCTCC TCCTCCCAAC TGGGAGAAGT 3360
ACAGGCTCTT TCGTGCAGCC CAGCAGCAGA AGCAGCAACA GCAGCAGCAG AAGCAACAGG 3420
AQGAGGAGGA GGAGGAGGAA GAAGAAGAAG AAGAGGAAGA GGAAGAGGAG GAGGAGGAGG 3480
CAGAGGAGGA GGAAGAGGAG CTGCCACCCC AGTATTTCAG TTCAGAAACC TCTGGTTCCT 3540
GTGCTCTCAA TCCTGAGGAG GTCCTAGAGC AGCCACAACC CCTCAGCTTT GGCCACCTGG 3600
AGGGCTCGAG ACAGGGTTCA CAAAGTGTCC CAGCAGAGCA AGAATCCTTT GCACTCCATT 3660
CCAGTGATTT CTTGCCTCCA ATAAGGGGTC ACTTGGGATC TCAACCTGAG CAGGCTCAGC 3720
CCCCTTGCTA CTATGGCATT GGTGGGCTTT GGAGGACATC GGGACAGGAA GCCACTGAAT 3780
CCGCCAAACA AGAGTTTCAG CACTTTTCGC CTCCTTCAGG GGCCCCAGGA ATCCCTACCT 3840
CTTACTCAGC TTATTACAAT ATTTCTGTGG CCAAGGCAGA GCTGCTGAAC AAACTGAAAG 3900
ACCAACCTGA GATGGCAGAG ATTGGCCTAG GAGAGGAGGA AGTTGACCAT GAACTGGCTC 3960
AAAAAAAGAT ACAGCTTATC GAAAGCATCA GCAGAAAACT TTCTGTCTTG CGGGAGGCCC 4020
AGCGAGGGCT GCTAGAGGAC ATCAATGCCA ATTCTGCCCT TGGGGAGGAG GTGGAGGCCA 4080
ACTTAAAAGC CGTCTGCAAA TCCAATGAAT TTGAAAAGTA CCACTTGTTT GTTGGGGACC 4140
TGGACAAAGT GGTCAACCTG TTGCTGTCAC TCTCTGGACG ACTGGCCCGG GTGGAGAATG 4200
CTCTGAACAG CATCGATTCA GAGGCCAACC AGGAGAAGTT GGTACTGATA GAGAAGAAGC 4260
AGCAGCTGAC GGGGCAGTTG GCAGATGCCA AGGAGCTGAA GGAGCACGTG GACCGCCGGG 4320
AGAAGTTGGT GTTTGGCATG GTCTCCCGCT ACCTGCCTCA GGACCAGCTC CAAGATTACC 4380
AGCACTTTGT CAAGATGAAA TCTGCTCTCA TCATTGAACA GCGAGAGCTG GAGGAGAAGA 4440
TCAAGCTCGG GGAAGAGCAA CTCAAATGTC TCAGGGAGAG TCTACTCCTG GGGCCCAGCA 4500
ATTTCTAATT CTACCAGCAC TCTGCCACAG CATCCCTGCC CAGCCATGTG GGAAGTGCTT 4560
TCAATCTTCT TTGTTAGCAG TTTCTCAGCA AGTAGATAGC AATTAGCAGT TTGTTCCAGC 4620
CCTCTACCCT GGATGTCTCT CACTACCCCT TCCCTAGCAG TGGTCCTAAC CAGCTAGGAG 4680
ACCCTGGGGA AGCCACAAGC TTCTACCCAA GGGAGCTGCA GCAAGGTGTG ATCTTAGAAC 4740
CACACTCTCC TTCCCACAGT TGCCAAGGGC AAGTACTTGC TGCACAGAGA ACCAAGGAAG 4800
TGCCTTCATT CTGCTTTGTA CTAGGACACC AAAGACATCA AGTACTCATC ACCCACCCAT 4860
ATCATCAACA GCCTCTAAAG GCTCAGAGGG AATCTGCCTT GCAGCTCTAC TCTGCCCCAG 4920
GGCTTGTGGC CAGCCATTTC TCACAGAGAG CTGGCTGCCT TGAGGGCATT CACCTGGCAC 4980
CAGTTTCAGG GCCTCACCCA AGCTTTGCAG GGGAAAGCAC AGAGGGAGGA ATTACACTGA 5040
AAAAAATGCA AGCAAAGGTT GAGTACCCCC AGGTGCCCCT TAGGAAGGAA CCAGGTTTAA 5100
ATAGGCTCTA CCCTTACCTT TCCCAGCAGC AAGTTCAGGG GAAGAGGCCT ACTCTTAGCC 5160
CTGGCTAGTG TGACCCTCTT CCTGTCCTAA GACTTTGGTC CTACCACCTC TTGTTTCATC 5220
TTTCCTTTAC ATTGCTGGGG GTTACCGCAG GTGCCTACCC CAGGGCTTCA CCATATGGGC 5280
CATTAATAGC TCTACTAAAA CTGACTTCTA GATGTAGGTT TCATTATTGG GGGAGGGGGT 5340
TCTTATTGTT ATATTTTAAA TGGCCTTTTG ATTTTATTTA TTTTTATGTT TTGATTATTT 5400
TTTTCTTTTT TAACTAATAA GGCGAGAAGA GGGAAGTTGG AGAGGGAAAA GTTAGCCCAG 5460
AAGGAAAGCA TTTTCTGCAG ATCAGCCTGA ATCCACCGTG GCTAGGCATA TTCTTGCTCT 5520
TCTCGTGTTG CTCACAACTA CCTGCCTGGA TGAATTTAGG AAAGTTGCAG GATACAAGGT 5580
TAAAACACAA GATCAAATGA ACAATCCGAA AATGTTATTA AGAAAACAGT TCCGGCCGGG 5640
CATGGTGGCT CACGCCTGAA ATCCCAGCAC TTTGGGAGGC CGAGGCAGGT GGATCACGAG 5700
GTCAGGAGAT CAAGACCATC CTGGCTAACA CGGTGAAACC CTATCTCTAC TAAAAATACA 5760
AAAAATTAGC CAGGTGTGGT GGCACGCACC AGTAGTCCCA GCTACTCGGG AGGCTGAGGC 5820
AGGAGAATTG CTTGAACCTG GAAGGCAGAG ATTGCAGTGA GCTGAGACCA CACCACTGCA 5880
CTCCATCCTG GGCAACAGAG TGAGACTTTG TCTCAAAAAG AAAGAAAGAA AGAAAGAAAG 5940
AAAGAAAGAA AGAAAAGAAA GAAAGAAAGA AAGAAAGAAA ACAGTTCCAT TTACAATAGC 6000



MYPNQRDSAY 
RRTNGGHLTP 
PWPLPQKEK 
QASTKAVGSP 
LQHVHLDTRG 



11 

I 

r VPVQLQGGAP 
QEALILIKGS 
GCNTSDVCVQ 
SSFSASSNAS 
SSQMSSRPQE 
LSLEPVLPAR 



WGFTUKGGLE 
FRIIiKLIVRR 
WCPLSRHCST 
DCALSLRPEE 
GYQSGPAKAV 
NPNRFCCLSG 
SNPLNEASAE 
DGHQWTLSPL 



HCEPLTVSKI 
RNAPVSRPHS 
EKSSSIGSME 
PASTDCIMQG 



EDGGKAALSQ KMRTGDELVN 

WHVAKLIiEGC PEAATTMHFP 

SIiEQPGQATY ESHLLPIDQN 

PGPTKAPSGR PNVAETSGGS 



VRGGHWRWSP 



DQSYHSMSPL 
CIYCSGEICP 
KLTVQEFPGD 
LDLLGDFKHA 
FSKGDETQSD 



VPAEQESFAL 
SPPSGAPGIP 
ISRKLSVLRE 
SLSGRLARVE 
RYLPQDQLQD 



EHNSQPIiVAA 
GLTTHSNKTF 
QSETPTYSEC 
ALLKRNMMPN 
KWNPITGNRK 
LKKSEETSVY 
LLGARKKAFP 
EAEEEEEELP 
HSSDFLPPIR 
TSYSAYYNIS 
AQRGLLEDIN 
NALNSIDSEA 



KECSRMGGRR 
EETQEPPESP 
LEGRISPGQR 



TQRPKPIDQN 
FASKGLENSM 
CYNCRCHHHQ 
TSQSGREMAH 
EEGSSLASMP 



PQY 
GHLGSQPEQA 
VAKAELLNKL 
ANSALGEEVE 



DPNRTSRAAS 
SGGTRGRSIQ 
PLTASNTSLL 
PGQSSLGLNT 
NKELKASTAQ 
FQPMSSSCRE 
CCKPLHCGDF 
CIRCSVCYHN 
SKTSFSWATP 
HPLRSRAFSE 
KYRLFRAAQQ 
SCALNPEEVL 
QPPCYYGIGG 
KDQPEMAEIG 
ANLKAVCKSN 
KQQLTGQLAD 
KIKLGEEQLK 



GSLVQQATDC 



ELANQQPSAS 
NRRKSERFAT 
SSCKKPPSPR 
WWKAPDPSSS 
AGEDAILLPF 
LRRHPMDQSY 
DYHRTCSYSC 
PQHSALEDSS 
FHPCLEMPAL 
SHISLAPQST 
QKQQQQQQKQ 
EQPQPLSFGH LEGSRQGSQS 
LWRTSGQEAT 
LGEEEVDHEL 
EFEKYHLFVG 
AKELKEIIVDR 
CUiESLLLGP 



DKLFNKSMML 
DPEKAHAHCG 
ADRRKFFEES 
HSADQPYHAT 
SVQGALVHDP 
LAPGNTWKPR 
DLSSYRAISS 
RAWGQHRREL 



AQKKIQLIES 
DLDKWNLLL 
REKLVFGMVS 



1080 
1140 
1200 
1260 
1320 
1380 
1440 



Seq ID NO: 60 Nucleotide sequence: 
Nucleic Acid Accession #: NM_014705 

Coding sequence: 192..2489 (underlined sequences correspond to start and stop codons)



I 

GGGAGAAGCT 
ATTATATGCA 
AATGCTTCCT 
GAGCGGGAAA 
GAGAGGTTGT 
TACGCTACAT 



TCCAGAACTT 
CAGAGCAGTA 
CCTCTTTGTA 
GATTTTATGG 
ATGACTACGA 
TCGCCATGCA 
TGCAGATATA 
TTCCGGACAA 
GACCATTTCA 
GAACGTCATT 
AGCGTGAAGT 
ATCAGCAGCT 
CCCTGACTAT 
ATCAAGAGGC 
TTGCACGATT 
TGCATGAGAA 
TCTTTGTGAT 
CTGTCCATTT 
GCCCAGATGG 
GATATTCTTC 
AAT CAGAAAG 
GTTCTACTCA 
CTCCTTTGTT 
AGAGACCATG 
ATCATATTGG 
CTTCACCAGC 



11 
I 

AGGAAAAAAT 
TGTGTTTGTA 
GGATTAGAAT 
CATGGCGGGA 
TAGATTACAG 
TCACAAACTC 
CCTCTTATAT 
CCCCATGCAA 
TGACAGAGGC 
TGAGAGTTAT 
TGACAAAATT 
AAAAAAATTT 
GAGGCTGGAA 



21 



CATCAAAAGC 
C AAAGG C AC A 
ATACTTGGTG 
GGTAGAAATG 
GAAGACTCTG 
GTGCCTGAAT 



TCCTAATGGA 
TACCAGGGTA 
CTCCTCACTG 
CTCTGATGAA 
CTCGGCTTCA 
GTCTGACAAA 
CAGTGCCATC 



I 

" GTCTTTGAGC 
TTTTATGACT 
TCCACTATTT 
AAGTGGCGTT 
AACTTCTATA 
TATGATCTGC 
GACGAGCTAC 
ACAGAATGGC 
AAATGTTGGG 
TATGACTACA 
ATGGACCAGC 
CCATTTTTCT 
GCCTTCCAAC 
CAGCCCGATG 
CCCATTCCAG 
TTCTATAAAG 
AAAGATAAAG 
CAGAGTTTGC 
AGTCCTCTGG 
ATTAGTCAGT 
GGAGTTATAG 
AAAGAATATA 
ATGCTTGAGC 
CAAGATATGA 
TTAGGGATAC 
AGCCCTCGTG 
ATTCCTAGAC 
TCCTCACAAG 



TGGATAATCT 
GGTCCCTATC 
TCATTAATTG 
AGACTGAACT 
ATCTCAAAGC 
TGGAATGGTC 



TTGTATATTT 
GAAAATCAAT 
CTAGTCTACT 
CTACTGTAAC 



AGAATGGCAT 
GAAACCTGAG 
AACGTCTTGA 
TAAGAAATAA 
AGAGAATGCT 



ACAGAACTTT 
TGATCGGCCC 
GCACCTGCAC 
TATCTTGTGC 



51 
I 

TGAAAATATG 
TTGCTTTGTC 
AAAGAAAATT 
TCGTCTAATG 
GAGATGTATA 
ACAGAAGCTG 
CTCAGGGAGT 
CTCACCATCA 



AAGACACACG 



CCTAATGTGA 
CACAAACATT 
TATCCAACAC 
TTGCCACGCA 
ACATCAGTAT 



TGAATCACAT 
AGAATGAATT 
CTGGCATCTC 
AAAATGCAAT 
GTCAGACAAG 
ATGCTGCAGT 
TCTTAAGTCA 
AGGCACAGAT 
GACCCCTTCA 
AGGAGTTCTC 
TGTGTAGAAA 
GCAGCCCGTT 
CTTCTGCTGA 
TGCAGCCAAG 
CAAGTTCTGC 
CCCGAGAAAA 
CTGTGGAGCC 
GTGACCCAAA 
CCCCCTCGCC 



ACCAGAGTTC 
GGAGTTTGTG 
GAACGAGTTC 
CCAGGCAGAA 
GGTCCTGCAG 
CTGGAAATTC 
CAAGAGTCTC 



TGAAGTGCTA 
ACAGATGCAG 
TAATGGTGGC 
CCCTGA^GAT 
TCTGGAATTT 



GCTCAGTATT 
AGAGAGGGTG 
CGCTATGACC 
TGGGTGGAGA 
GAAGTGGAAA 
GAAAATAAGA 
AATATTAATC 
GTTTCCAGGT 
GGGGAGAAAA 



CTCAGCACCT 
AAC-TTACCCA 
AGTAAGCAAT 
TCCATCTACC 
TCCATCGAG? 
CTCTTGCCTG 
TTCGCAGAGG 
TCTCTCTGCA 
TGCCGG3CGA 



GTTGACCAAT 
CAAGCCAGTC 
GCTTCTGTGA 
GCTGTCAACC 
ATTACAGGGC 
TCAAGCTTGA 
GCCAGAGCTT 
TCACCAAGAG 
ATGCTGTTTA 
CCTGAAAAAG 
TCTCCATTGA 



1200 
1260 
1320 



1560 
1620 
1680 



1860 
1920 
1980 
2040 
AGGGCTCTGT
CCAACTCCCC 
GCACGTCGGA 
TGCCAGTGCC 
AGACTCCGCC 
ACAGCCTCTC 
CGCGATCCAG 
CCCTGCCCCG 
TTGCCCGTTT 
CTTACTCAGC 
ATATTAAAAG 
AAAACGTTTC 
GTTCACTTTT 
TGTGTGACAA 
TCAGTGCACA 
TTTATCAGTG 
GGGGTTGCAA 
GTTGCTTTTG 
CTAAATCCCT 
TGCTTCAGAG 
ATTTTGCGCT 
TGCTGAAGTA 
TACAGACAGT 
TTAGCACGAT 
CACAAAATAA 
CCTGTCGTCT 
ATAGCATCTT 
CTTGTATGTA 
TCAATTGACA 
TGTTGGGAAG 
AGTGACCACT 
TGGTTATAGA 
TAATAAGAGA 
CAGTATGATT 
ATGTGCTTTT 
ATATGTTAAA 



GGTGCCCGTG 
CCCGTACAGC 
CATCCCCGTC 



CAAGGTCTCT 
ACAAAATAAG 
TCCTTCGATG 



ACCCCCTCTC 
GGCAGCTACA 
TTTGAAAATC 
CCGAGCTACG 
GTCTACGAGC 
ACGTCGGAGC 
AATGGGGCCC 
CAGTTATAAG 



GATAACACTT 
TATTTCTGTT 
AACAAAGCTT 
GGGCACATTA 
AGTGAGCAGT 
AGCAGAACAT 
TAAATATTAA 



TCATTTGGAA 
TTTTATATAG 
TGTTGTATCG 
TCCTAACACA 
CAGATCATCA 
AGCTTGGGAC 
CTGTTTGACT 
TACTACCCTG 
TTCCAATCAG 
TCCTTTCTTG 
TTAATGAGGC 
ATTAAATTAA 
TAAATGGCAA 
AAAATTCCTC 
TTCTCTTGAA 
AGGTCTAGAT 
GTAATGTATT 
ATAGATATGA 
CTCATTTTAG 
TTTTTATATT 
GGTTTTCTAT 
TACCATTATG 
GTTATAATTA 
AACAATTCTG 
TATTTATGTT 
TATCTTGCAA 
CGTGTTTGTA 
GCTTTAAAGA 
CCTTGATTTC 
TAAATCAGTG 
TGAATTTATC 
TTTAAGAGAT 
ATAAAATTAC 



AATGGAATTA 
AAAACGCCAG 
GTGGTAAATA 
TTTAATCTTA 
TTTTTACTGA 
GTGGTCCTTA 
GAATTAAAGT 
TGGAAATTGT 
GCTGTCTACA 
ACACCGTGGT 



CAGTGGAGTA 
GCAGTGGGAT 
AGGTGAATGA 
GCGGGGAGGA 
GGACTCTGCG 
CGCCCGCGCT 
GGAGGACTGA 
TCACTTTTCT 
AGAAGACATT 
AAACTTGCTT 
ATGTTAAATG 
GTGATAAAGA 
AAACCAATAC 
ATACTTGATA 



CCACTCGCCA 
TTCTTCTCTC 
ACAGTCGGCC 
GCCAGTGCGC 
GCGCCCCGTC 
GCCCCCCAAG 
CCCCGGCCCG 



TAGTGTAGGC 
ATTAAATATC 
AAGTATGGCT 
CTCCTTTTGT 



TCAAGCAGGC 
TTTGTTCTTG 
TTCGTAAAAT 
ATCTACTGTA 



ATAGTTTCCT 
ACAAATGACT 
TAGCTTAAGT 
AAGGACTTAT 
ACGACTCTCC 
GGAGGATTGC 
AGGGATGCTA 
TTATATCTTT 
ACTGTATTTT 
TTTAACCATG 
CTCTTAATGA 
AATAAATGTT 
TGTAAAAAAA 
TGCCAAATAC 
CCATATTGAC 



ATATAAAATT 
GTGCCCCATT 
GAAGAAAAAA 
AAAAGGCAAG 
ACTTTTACCA 
TGTAGACTTC 
ATGCCAGTTG 
GTTTTTTCTT 
AAATCATACT 



GAGCAAGACA 
AAACAAAATA 
TCTATTTTGT 
TTTCTTTTCA 
CAGAAGAAAT 
TGTCATT3AA 
AGCAAGAATT 
AAAAACTTAG 
GGCATTAACT 
GTGGAGTTTG 
TCTAGCTTGA 



GTTTTGCAGT 
ACCACATTCA 
AGTAAAACCT 
CTTTTCCTGT 
TTATTTGCTC 
TTGAATTTAT 
TCTGTTAGCC 
CAGATTAATC 



AACCATTCAG 
CAAGTATTTT 
AGTAAAAATA 
TTTACGTATG 
TTTACAATTT 
GTTCATAATT 
CAGAAATTTT 
TATATATATA 
TCCATTTAAA 
GTAATTTAAT 
TTGGAGCCAT 



CACATGTTCA 



CAGGTATATG 
TGTATATAAC 
AACGCAAACA 
GAAAAAGAAT 
TGAATGTCGG 
GAGAAAAGGA 
TTGAACTAGC 
CATATATATG 
TAAGATGACA 
AGATTTGTTG 
TTTTTAAAAA 
TATTTCTGAA 
TGTTGGTTGC 



GGACTCATCT 


2100 


AGCCGGTGCA 


2160 


CCCCTGCCGG 


2220 


AAGGAGAGCA 


2280 


CCGCTACCTC 


2340 




2400 




2460


GATGCATTCT 


2520 


ACTTTAATAA 


2580 


ATGTTGCACA 




GAATTTCATT 


2700 




2760 


AACGATACAA 


2820 






TAAGGCAAGG 


2940 


GTATACTTAA 


3000 


CTTCTTTAAG 


3060 


GAATTGGTAG 


3120 


AGGTGCAATT 




GTTTACTTGA 


3240 


CAAGAGCAAA 


3360 


TGAAAATGTA 


3420


TTCAGTCCTG 


3480 


TCTTAGCTGA 


3540 


ATATTGCAAC 


3600 


AAAAAAGCAC 


3660 


TTTATAACAC 


3720 


TTCCCTCTTT 


3780 


CCTGGAAGGC 


3840 


CTGTTGAGGC 


3900 


TGGOTTAATTT 


3960 


GTTGTTGATG 


4020 


ATTTGTACAT 


4080


AAAAATTAAT 





ACTAGTGCTT 
AAATTTTGGC 
GCAGCTACAT 
TGCTGAATGG 



I 



LLYDELLEWS 
ESYYDYRNLS 
RLBAFQQRML 
IKSFYKVNHI 
VEMSPLENAI 
FFVKEYILSH 
KSSLGIQEFS 
SSLSSQASAE 



11 
I 

YCNSSNGEW 
DRPLREFLTY 
KMRMMEASLY 
NEFPHAIAMQ 
WKFRYDRPFH 
EVLENKNQQL 
PEDGEKIARL 
ACMQASPVHF 
VSNITGQSES 



21 



31 



41 



51 



1 I I I 

RLQNFYKTEL NKEEMYIRYI HKLYDLHLKA QNFTEAAYTL 
PMQTEWQRKE HLHLTIIQNF DRGKCWENGI ILCRKIAEQY 
DKIMDQQRLE PEFFRVGFYG KKFPFFLRNK EFVCRGHDYE 
HANQPDETIF QAEAQYLQIY AVTPIPESQE VLQREGVPDN 



KTLISQCQTR QMQNINPLTM CLNGVIDAAV NGGVSRYQEA 
RELMLEQAQI LEFGLAVHEK FVPQDMRPLH KKLVDQFFVM 
PNGSPRVCRN SAPASVSPDG TRVIPRRSPL SYPAVNRYSS 
SDEVFNMQPS PSTSSLSSTK SASPNVTSEA PSSARASPLL 
SAIYPTPVEP SQRMLFNHIG DGALPRSDPN LSAPEKASPA 
QSFTPSPVEY HSPGLISNSP VLSGSYSSGI SSLSRCSTSE 
VPVPSYGGEE PVRKESKTPP PYSVYERTLR RPVPLPHSLS 
HLENGARRTD PGPRPRPLPR KVSQL 



Seq ID NO: 62 Nucleotide sequence:



3 start and stop codons) 



41 



51 



I I I I I I

ATGGACCGAG GCCAGGGTAA GAGGGGCCGC GACGCCCGCA CTTGTTGCGG CGCCGGGCGG
GAAAGGGAGA CTGGACGATC TGAAGCCGGA GAGGAGGAGG GAGAGAGGCG GGCGGTGGGG 
CGGGGGCTGA GGAACGCTCG GAGGGGACTG GGAGACGCGG CGCTTATGCA AAGGTGCCTT 
CGGCTGCCGG GACAACCCGC CAGCAACCAG GTACAGCTCT CAGAGGTTCC ACAGAGGAAG 
CTCAGGGTCC GTGAATCTCC CAGTGTGGCA GAGAAAGTGA AACTTGGTCA CCGATGCCTG 
GAACTGCTGG AGCAGCTGCT CCCAGAGCTC ACCGGGCTGC TCAGCCTCCT GGACCACGAG 360

TACCTCAGCG ATACCACCCT GGAAAAGAAG ATGGCCGTGG CCTCCATCCT GCAGAGCCTG 420 

CAGCCCCTTC CAGCAAAGGA GGTCTCCTAC CTGTATGTGA ACACAGCAGA CCTCCACTCG 480 

GGGCCCAGCT TCGTGGAATC CCTCTTTGAA GAATTTGACT GTGACCTGAG TGACCTTCGG 540 

GACATGCCAG AGGATGATGG GGAGCCCAGC AAAGGAGCCA GCCCTGAGCT AGCCAAGAGC 600

CCACGCCTGA GAAACGCGGC CGACCTGCCT CCACCGCTCC CCAACAAGCC TCCCCCTGAG 660

GACTACTATG AAGAGGCCCT TCCTCTGGGA CCCGGCAAGT CGCCTGAGTA CATCAGCTCC 720 

CACAATGGCT GCAGCCCCTC ACACTCGATT GTGGATGGCT ACTATGAGGA CGCAGACAGC 780 

AGCTACCCTG CAACCAGGGT GAACGGCGAG CTTAAGAGCT CCTATAATGA CTCTGACGCA 840 

ATGAGCAGCT CCTATGAGTC CTACGATGAA GAGGAGGAGG AAGGGAAGAG CCCGCAGCCC 900 

CGACACCAGT GGCCCTCAGA GGAGGCCTCC ATGCACCTGG TGAGGGAATG CAGGATATGT 960

GCCTTCCTGC TGCGGAAAAA GCGTTTCGGG CAGTGGGCCA AGCAGCTGAC GGTCATCAGG 1020 

GAGGACCAGC TCCTGTGTTA CAAAAGCTCC AAGGATCGGC AGCCACATCT GAGGTTGGCA 1080 

CTGGATACCT GCAGCATCAT CTACGTGCCC AAGGACAGCC GGCACAAGAG GCACGAGCTG 1140 

CGTTTCACCC AGGGGGCTAC CGAGGTCTTG GTGCTGGCAC TGCAGAGCCG AGAGCAGGCC 1200 

GAGGAGTGGC TGAAGGTCAT CCGAGAAGTG AGCAAGCCAG TTGGGGGAGC TGAGGGAGTG 1260 

GAGGTCCCCA GATCCCCAGT CCTCCTGTGC AAGTTGGACC TGGACAAGAG GCTGTCCCAA 1320 

GAGAAGCAGA CCTCAGATTC TGACAGCGTG GGTGTGGGTG ACAACTGTTC TACCCTTGGC 1380 

CGCCGGGAGA CCTGTGATCA CGGCAAAGGG AAGAAGAGCA GCCTGGCAGA ACTGAAGGGC 1440 

TCAATGAGCA GGGCTGCGGG CCGCAAGATC ACCCGTATCA TTGGCTTCTC CAAGAAGAAG 1500 

ACACTGGCCG ATGACCTGCA GACGTCCTCC ACCGAGGAGG AGGTTCCCTG CTGTGGCTAC 1560 

CTGAACGTGC TGGTGAACCA GGGCTGGAAG GAACGCTGGT GCCGCCTGAA GTGCAACACT 1620

CTGTATTTCC ACAAGGATCA CATGGACCTG CGAACCCATG TGAACGCCAT CGCCCTGCAA 1680 

GGCTGTGAGG TGGCCCCGGG CTTTGGGCCC CGACACCCAT TTGCCTTCAG GATCCTGCGC 1740 

AACCGGCAGG AGGTGGCCAT CTTGGAGGCA AGCTGTTCAG AGGACATGGG TCGCTGGCTC 1800

GGGCTGCTGC TGGTGGAGAT GGGCTCCAGA GTCACTCCGG AGGCGCTGCA CTATGACTAC 1860 

GTGGATGTGG AGACCTTAAC CAGCATCGTC AGTGCTGGGC GCAACTCCTT CCTATATGCA 1920 

AGATCCTGCC AGAATCAGTG GCCTGAGCCC CGAGTCTATG ATGATGTTCC TTATGAAAAG 1980 

ATGCAGGACG AGGAGCCCGA GCGCCCCACA GGGGCCCAGG TGAAGCGTCA CGCCTCCTCC 2040 

TGCAGTGAGA AGTCCCATCG TGTGGACCCG CAGGTCAAAG TCAAACGCCA CGCCTCCAGT 2100 

GCCAATCAAT ACAAGTATGG CAAGAACCGA GCCGAGGAGG ATGCCCGGAG GTACTTGGTA 2160 

GAAAAAGAGA AGCTGGAGAA AGAGAAAGAG ACGATTCGGA CAGAGCTGAT AGCACTGAGA 2220 

CAGGAGAAGA GGGAACTGAA GGAAGCCATT CGGAGCAGCC CAGGAGCAAA ATTAAAGGCT 2280 

CTGGAAGAAG CCGTGGCCAC CCTGGAAGCT CAGTGTCGGG CAAAGGAGGA GCGCCGGATT 2340 

GACCTGGAGC TGAAGCTGGT GGCTGTGAAG GAGCGCTTGC AGCAGTCCCT GGCAGGAGGG 2400 

CCAGCCCTGG GGCTCTCCGT GAGCAGCAAG CCCAAGAGTG GGCAACTCTC TGAGGAAGAT 2460 

ACGCTCACCT CCAATGGTGC TCTCTCAGAG AGAACTTCTC TGACCTCATC TACACCAGGG 2520 
CTTCTCAACC CCAACACTAC TGACATTTTG GACCAGTAA 



Seq ID NO: 63 Protein sequence: 

Protein Accession #: fgenesh prediction



1 11 21 31 41 51

I I I I I I 

MDRGQGKRGR DARTCCGAGR ERETGRSEAG EEEGERRAVG RGLRNARRGL GDAALMQRCL 60 

RLPGQPASNQ VQLSEVPQRK LRVPESPSVA EKVKLGHRCL ELLEQLLPEL TGLLSLLDHE 120 

YLSDTTLEKK MAVASILQSL QPLPAKEVSY LYVNTADLHS GPSFVESLFE EFDCDLSDLR 180

DMPEDDGEPS KGASPELAKS PRLRNAADLP PPLPNKPPPE DYYEEALPLG PGKSPEYISS 240 

HNGCSPSHSI VDGYYEDADS SYPATRVNGE LKSSYNDSDA MSSSYESYDE EEEEGKSPQP 300 

RHQWPSEEAS MHLVRECRIC AFLLRKKRFG QWAKQLTVIR EDQLLCYKSS KDRQPHLRLA 360

LDTCSIIYVP KDSRHKRHEL RFTQGATEVL VLALQSREQA EEWLKVIREV SKPVGGAEGV 420 

EVPRSPVLLC KLDLDKRLSQ EKQTSDSDSV GVGDNCSTLG RRETCDHGKG KKSSLAELKG 480

SMSRAAGRKI TRIIGFSKKK TLADDLQTSS TEEEVPCCGY LNVLVNQGWK ERWCRLKCNT 540 

LYFHKDHMDL RTHVNAIALQ GCEVAPGFGP RHPFAFRILR NRQEVAILEA SCSEDMGRWL 600

GLLLVEMGSR VTPEALHYDY VDVETLTSIV SAGRNSFLYA RSCQNQWPEP RVYDDVPYEK 660 

MQDEEPERPT GAQVKRHASS CSEKSHRVDP QVKVKRHASS ANQYKYGKNR AEEDARRYLV 720 

EKEKLEKEKE TIRTELIALR QEKRELKEAI RSSPGAKLKA LEEAVATLEA QCRAKEERRI 780

DLELKLVAVK ERLQQSLAGG PALGLSVSSK PKSGQLSEED TLTSNGALSE RTSLTSSTPG 840 
LLNPNTTDIL DQ 

Seq ID NO: 64 Nucleotide sequence: 
Nucleic Acid Accession #: NM_004126.1 

Coding sequence: 108-329 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 60 

AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC 120 

ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 180 

AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 240 

AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 300 

AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360 

AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420 
TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT
GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 
ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT 
GCTTCAAATA AAGTTTTGTC TT 



Seq ID NO: 65 Protein sequence:
Protein Accession #: NP_004117



I I I 

MPALHIEDLP EKEKLKMEVE ~ 
KNPFKEKGSC VIS 



Seq ID NO: 66 Nucleotide sequence:
Nucleic Acid Accession #: NM_003842.1

Coding sequence: 1-1236 (underlined sequences correspond to start and stop codons)



CTAGCTCCCC 
TGTCCACCTG 
CAGGACTATA 
TCAGGTGAAG 
GAAGGCACCT 
CCCAGAGGGA 
AAAGAATCAG 
TTTGTTTGCA 
GGTGGTGGTG 
AATGTCCTCA 
GAAGTCCAGG 



AATGAAGGTG 
CCCTTTGACT 
GTGGCTAAAG 
GTCAACAAAA 
GGAGAGAGAC 
TAT C TAG AAG 



11 
I 

GGGGACAGAA 
AGGCGCGGGG 
CGGTCCTGCT 
AGCAGAGAGC 
GACACCATAT 
GCACTCACTG 
TGGAGCTAAG 
TCCGGGAAGA 
TGGTCAAGGT 
GCATCATCAT 
AGTCTTTACT 
GGGACCCTGA 
ATGAGATCGT 
AGCCAGCAGA 
AACCGGCAGA 
ATCCCACTGA 
CCTGGGAGCC 
CTGAGGCAGC 
CCGGGCGAGA 
TTGCCAAGCA 
GTAATGCAGA 



CGCCCCGGCC 
AGCCAGGCCT 
GTTGGTCTCA 
GGCCCCACAA 



GAATGACCTC 
TCCCTGCACC 
AGATTCTCCT 



GTGGAAGAAA 



GAGTATCTTG 
GCCAACAGGT 
AGCTGAAAGG 
GACTCTGAGA 
GCTCATGAGG 
GGGCCACAGG 



GCTTCGGGGG 
GGGCCCCGGG 
GCTGAGTCTG 
CAAAAGAGGT 
GGTAGAGATT 
CTTTTCTGCT 
ACGACCAGAA 
GAGATGTGCC 
ACACCCTGGA 
GTTGCAGCCG 
GTCCTTCCTT 
AGAAGCTCAC 
CAGCCCACCC 
GTCAACATG? 
TCTCAGAGGA 
CAGTGCTTCG 
AAGTTGGGCC 
GACACCTTGT 
CACACCCTGC 
GACCACTTGT 
TCCTAA 



I 

CCCGGAAAAG 
TCCCCAAGAC 
CTCTGATCAC 
CCAGCCCCTC 
GCATCTCCTG 
TGCGCTGCAC 
ACACAGTGTG 



51 

I 

GCACGGCCCA 
CCTTGTGCTC 
CCAACAAGAC 
AGAGGGATTG 
CAAATATGGA 
CAGGTGTGAT 
TCAGTGCGAA 
CACAGGGTGT 

GTGACATCGA 
TAGTCTTGAT 
ACCTGAAAGG 
AACGACCTGG 
AGGTCCCTGA 
TGTCCCCCGG 
GGAGGCTGCT 
ATGACTTTGC 
TCATGGACAA 
ACACGATGCT 
TGGATGCCTT 
TGAGCTCTGG AAAGTTCATG 



GCAGGAAATG 
GGAGTCAGAG 
GGTTCCAGCA 
AGACTTGGTG 
TGAGATAAAG 
GATAAAGTGG 



1020 
1080 
1140 
1200 



1 11 21 31 41 51 

I I I I I I 

MEQRGQNAPA ASGARKRHGP GPREARGARP GPRVPKTLVL VVAAVLLLVS AESALITQQD
LAPQQRAAPQ QKRSSPSEGL CPPGHHISED GRDCISCKYG QDYSTHWNDL LFCLRCTRCD
SGEVELSPCT TTRNTVCQCE EGTFREEDSP EMCRKCRTGC PRGMVKVGDC TPWSDIECVH
KESGIIIGVT VAAVVLIVAV FVCKSLLWKK VLPYLKGICS GGGGDPERVD RSSQRPGAED
NVLNEIVSIL QPTQVPEQEM EVQEPAEPTG VNMLSPGESE HLLEPAEAER SQRRRLLVPA 
NEGDPTETLR QCFDDFADLV PFDSWEPLMR KLGLMDNEIK VAKAEAAGHR DTLYTMLIKW 
VNKTGRDASV HTLLDALETL GERLAKQKIE DHLLSSGKFM YLEGNADSAM S



Coding sequence : 



I 



GGCACCATCT 
TGTCGCTGCC 
GGTTACACTG 
GCTGAGACGT 
TGCGAACACG 
GGTCTCAGCT 
ATGAACGGGG 
CCGCAGGACA 
TGCCAGGCTA 
AGTCTTTGTC 
GCCATCGCCT 
AACTGCTCTG 
TGTGCCCATG 



AGTGCTCCTG 
CGCATGGGCC 
CCAGCGGCCT 
CTCCTGACAC 
GCTCACCCAT 
TGCCCTGCCC 
AGGCAGTCTG 



I 

CTGCCCAGAG 
CCTCTGTGAC 
CCGGGAGGAG 
CCCGGACGCC 
GGACCGCTGC 
CTGCACCTGC 
CCTGCCGGGC 



I 

GGCTTTCACG 
CGATTCACTG 
TGCCCGGTGG 
CGTTGCTTCC 
ACGGATCGCC 



GACCCAACTG 



TGGGCGGGCC 
GAGCACTGTC 
GCGCCGGGTT 
AACTGTTCTG 
TGCGTCTGCA 
TGGGGCTTCA 
ACTGGAGCCT 



TCTGCCCCGA 
ACAGCCTCAG 
TCCACTGCAA 
TCTGCCTGCA 



CACGCTGCTC 
AGGAAGGTTG 
GTTGCAATGC 
GTACCTGCAC 



51 
I 

CTCCCAGGAA 
CTGCGCTCCG 
GCAGGACTGT 
CGCATGTCTG 
CGGCTTCTAC 
CTGCCACCCG 
CGAGAGCTGC 
CGGTGGCGTC 
TCACTGTGCT 
ATGTGAAAAT 
GCAGCGTGGT 
CAGCTGCCAG 
CCCTGGGTGG 
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CATGGGGCCC 
CGCTGTGACT 
GCTGGCTGGA 
TGTAGCAACA 
GTGTGTGCAC 
GGCAAACGCT 



ACTGCCAGCT 
GTGACCACTC 
TGGGTGCCCG 



GCCCTGTCCG 



CCGGATTCCG 
GTGTGCCCTG 
GCCTGGCTGG 



GAGACTGGGG 
CAGGAGCCCT 
ATTGGCATTG 
CGGCACTGGC 
CTGGACGGCT 
AACCCCAGCT 
CCAGGCCCGC 
GATAACCACA 
CTGGACAGGG 
GGCCCATTCT 
CTGAGCAGTG 
CGGGAGAGCA 
CCTCAGTTCT 
TACGAGCAGC 
CCTCCGGGCC 
TATGACTTGC 



CCTGTGTATG 
TTACTGTGAT 
CAGTGCTGGG 
AAAAAGGCAA 



ACCACACCCT 
TCTTTGCCAG 
CCACCCTGCC 



CTGCCACCTG 
CAAGAATGGG 
GGGCCCCTCC 
CAAGTGCGCT 
CTGGACAGGC 
CCAGCCATGC 
TCCCCCAGGG 
GCCGACCACT 
GTCCCTTGTG 
GGAGCACCAC 
CATGCCAGAT 
GTCGCAGTGC 



AAGGGGCAGT 
GACGCTGTTC 
TCCTGCCCTG 
GGCACCTGTC 
TGCCAGAGAT 
AACCACTCCT 
CCCGACTGCT 
CAGTGTGGTC 
CACAGTGGTG 



TTGGAGAAGG TTGTGCCAGT 



CACGTGGCTG 
GTCCCTCCGA 
TCCCCAAACC 



TCCCTGAGAA 
CCTGTCAGCC 
TCTGCCACCC 
CCCAGCGCTG 
CTGGAGAAAA 
CACCTTGCAG 
ATAACTCGCT 
TGGCACTGTT 



GGGAGTCAAC 
TGGCAACTGC 
TGGCCGCTAT 
CTCGAACGGG 
CCCTCTGGGG 
GTGCCACCCA 
GATTGGAATC 
GGGTGCAGTG 
CATTGGCTAT 



TGCTGACTGG 
CCTGGACCGA 
GCTCATCTCT 



GCTACATGGA GATGAAAGGC 
GGGACAGCCA GAGGCGGCGG 
CCAGCCCCCT GATCCATGAC 
TACCCCCCGG CCACTATGAC 
CTCCAGTACG GCATCCCCCA 



AAGCACCGCC 
AGCTACAGCT 
GAAGAGGAGC 
CGGGACCTGC 
CCTCCCTCAG 
CAACCCCAGC 
CGAGACTCTG 
TCACCCAAGA 
TCACCTCCAC 



GGGAGCCCCC 
ATAGCTACAG 
TCGGGGCCAG 
CCAGCTTGCC 
GATCTCCCCC 
CACAGAGAGA 
TGGGCTCCCA 



CTACTACTCC 
TAACAAGGTT 
CCAAGGGCAT 
TCCAGGGCCT 
CAATGGCCCA 



TTCGACGCCA 



AGGGGGCCCC 
CAGGCAGCCT 
CAGTGGCACC 
GCCCCCTCTG 
CCCTGGACAT 
GGACCGTTGA 



1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 



1980 
2040 
2100 
2160 



Protein Accession # : 



GTICSLPCPE 
AETCDCAPDA 
MNGECSCLPG 
SLCPPDTYGV 
CAHEAVCSPQ 
AGWMGARCHL 



RCFPANGACL CEHGFTGDRC 



RFTGQCRCAP 
TDRLCPDGFY 
EHCLCLHGGV 



ETGACVCPPG I 



PGPLFASLQN 
GPFYNKGLIS 
PQFWDSQRRR 



EEELGASVAS 
QPQPQRDSGT 
SPPLRRQDR 



AIACSPIDGE 
HGAHCQLPCP 
CSNTCTCKNG 
TCYCLAGWTG 
QEPFTVMPTT 
LDGSEYVMPD 
DNHTTLPADW 
LSSENPYATI 
YEQPSPLIHD 



KGQFGEGCAS 
GTCLPENGNC 
PDCSQRCPL3 



VPPSYSHYYS 
KHRREPPPGP 
RDLPSLPGGP 
RDSVGSQPPL 



APGYTGPHCA 
WGFSCNASCQ 
DPVHGRCQCQ 
CQRSCQPGRY 
QCGPGEKCHP 
VALVALFIGY 
SPNPPPPNKV 
SYSYSYSNGP 
PPSGSPPRQP 



NC3VPCPPGT V 
RCDCDHSDGC I 
VCAPGFRGPS C 
TFGANCSQPC C 
IGIAVLC 
NPSYHTLSQC £ 
LDRGSSRLDR £ 
RESSYMEMKG I 
PPGLPPGHYD SPKNSHIPGH 720 



Seq ID NO: 70 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005458 

Coding sequence: 1..2826 (underlined sequences correspond to start and stop codons)



I 

ATGGCTTCCC 
GCGCGCCTGC 
GGCTGGGCGC 
CTCATGCCGC 
GTGGAACTGG 
CTGCGGCTCT 



ATCATTGCAG 
CCTGTTCTAG 
GCGGTGAATC 
CTGACGCAAG 
GGCGAGGACA 
AAAAAGCTGA 
GCAAAAGTGT 
ATTCCGGGCT 
CGCTGCCTCC 
CCCCTGAGCT 
GAGTACAACA 
GGCATCTGGG 



AATGCCATGA 
GAGAGAATGG 
GAGTACAACG 
TCCGAACCAC 



TACTGCTACT 
GGGGCGCCCC 
TCACCAAGGA 
CCATCGAGCA 
ATGACACGGA 
GGCCGAACCA 
AGTCCCTCCA 
CCGATAAGAA 
CAGCCATTCT 
ACGTTCAGAG 
TTGAGATTTC 
AGGGGAATGA 
TCTGTTGTGC 
GGTACGAGCC 
GGAAGAATCT 
CCAAGCAGAT 
ACAAGCGGTC 
TCATCGCCAA 
GGATCCAGGA 
ACGAGACCAA 
GGACCATTAA 
CTGTGGCCGA 
CAAAAGACAA 



CATCATGGGC 180 



GATCCGCAAC 
GTGCGACAAC 
CTTGATGGTG 
AGGCTGGAAT 
AAAATACCCT 
GAAGTTGCTC 
GTTCTCTGAG 
AGACACCGAG 
TGTGCGGATC 
ATACGAGGAG 
TTCTTGGTGG 
GCTTGCTGCC 



GAGTCACTCC 
GCAAAAGGGT 
TTTGGAGGCG 



: CGCCGCCGCC ACCGCCGCCC 
CTCTGGCGCC 
CGCCGCTCTC 
GGCGCGGTGT 
TGCGCCCCTA 
TGAAAGCCTT 
TCTGTCCATC 
TTTCTTTTGC 
GGACCGTCCC 
AGTGGAAGCG 
ACCTGACTGG 
ACGATCCCTG 
AGTTTGACCA 
GTAGTAAATA 
ACACGGAAGC 
ACATTGGCGT 
CTCCACAGCA 



CTTCCTCGAC 
CTACGATGCA 
CGTCACATCC 
TGCAACCACG 
AT C AGACAAT 
CGTGGGCACG 
AGTTCTGTAT 



GACACTGCAG 
CTTCAACTAC 
CTTCTTCGGG 
ATTTACTCAA 



AGCTTCTCCA 
ATCCTTGGCC 
AACATGTATG 
GAGCAGGTGC 
ATGGAGGGCT 
TCAGGAAAGA 
CCCAGCAAGT 
AGGGCCATGG AGACACTGCA 
ACGGACCACA 
GTCACGGGTC 
TTTCAAGACA 
ATCATCAATG 
CTGGAGCAGC 



AAGTTGTATT 
GCAC-GGAGGT 
ACACCATCAG 
TGCGGAAGAT 



GAATATGGCA 
TCAGTGGATC 
CAACTCATCC 
GGATTTCGAG 
GTATGAGAGA 
CGCCTACGAT 
TGCCAGCAGC 
GATCATCCTC 
CCGGAATGGG 
GAAGGTGGGA 
GTTCCAAGGA 
CTCCCTACCT 



1080 
1140
1200 
1260 
CTCTACAGCA TCCTCTCTGC CCTCACCATC CTCGGGATGA TCATGGCCAG TGCTTTTCTC 1500

TTCTTCAACA TCAAGAACCG GAATCAGAAG CTCATAAAGA TGTCGAGTCC ATACATGAAC 1560

AACCTTATCA TCCTTGGAGG GATGCTCTCC TATGCTTCCA TATTTCTCTT TGGCCTTGAT 1620

GGATCCTTTG TCTCTGAAAA GACCTTTGAA ACACTTTGCA CCGTCAGGAC CTGGATTCTC 1680

ACCGTGGGCT ACACGACCGC TTTTGGGGCC ATGTTTGCAA AGACCTGGAG AGTCCACGCC 1740

ATCTTCAAAA ATGTGAAAAT GAAGAAGAAG ATCATCAAGG ACCAGAAACT GCTTGTGATC 1800

GTGGGGGGCA TGCTGCTGAT CGACCTGTGT ATCCTGATCT GCTGGCAGGC TGTGGACCCC 1860

CTGCGAAGGA CAGTGGAGAA GTACAGCATG GAGCCGGACC CAGCAGGACG GGATATCTCC 1920

ATCCGCCCTC TCCTGGAGCA CTGTGAGAAC ACCCATATGA CCATCTGGCT TGGCATCGTC 1980

TATGCCTACA AGGGACTTCT CATGTTGTTC GGTTGTTTCT TAGCTTGGGA GACCCGCAAC 2040

GTCAGCATCC CCGCACTCAA CGACAGCAAG TACATCGGGA TGAGTGTCTA CAACGTGGGG 2100

ATCATGTGCA TCATCGGGGC CGCTGTCTCC TTCCTGACCC GGGACCAGCC CAATGTGCAG 2160

TTCTGCATCG TGGCTCTGGT CATCATCTTC TGCAGCACCA TCACCCTCTG CCTGGTATTC 2220

GTGCCGAAGC TCATCACCCT GAGAACAAAC CCAGATGCAG CAACGCAGAA CAGGCGATTC 2280

CAGTTCACTC AGAATCAGAA GAAAGAAGAT TCTAAAACGT CCACCTCGGT CACCAGTGTG 2340

AACCAAGCCA GCACATCCCG CCTGGAGGGC CTACAGTCAG AAAACCATCG CCTGCGAATG 2400

AAGATCACAG AGCTGGATAA AGACTTGGAA GAGGTCACCA TGCAGCTGCA GGACACACCA 2460

GAAAAGACCA CCTACATTAA ACAGAACCAC TACCAAGAGC TCAATGACAT CCTCAACCTG 2520

GGAAACTTCA CTGAGAGCAC AGATGGAGGA AAGGCCATTT TAAAAAATCA CCTCGATCAA 2580

AATCCCCAGC TACAGTGGAA CACAACAGAG CCCTCTCGAA CATGCAAAGA TCCTATAGAA 2640

GATATAAACT CTCCAGAACA CATCCAGCGT CGGCTGTCCC TCCAGCTCCC CATCCTCCAC 2700

CACGCCTACC TCCCATCCAT CGGAGGCGTG GACGCCAGCT GTGTCAGCCC CTGCGTCAGC 2760

CCCACCGCCA GCCCCCGCCA CAGACATGTG CCACCCTCCT TCCGAGTCAT GGTCTCGGGC 2820
CTGTAA 



Seq ID NO: 71 Protein sequence: 
Protein Accession #: NP_005449 

1 11 21 31 41 51 

I I I I I I 

MASPRRSGQP GRPPPPPPPP ARLLLLLLLP LLLPLAPGAW GWARGAPRPP PSSPPLSIMG 60

LMPLTKEVAK GSIGRGVLPA VELAIEQIRN ESLLRPYFLD LRLYDTECDN AKGLKAFYDA 120

IKYGPNHLMV FGGVCPSVTS IIAESLQGWN LVQLSFAATT PVLADKKKYP YFFRTVPSDN 180

AVNPAILKLL KHYQWKRVGT LTQDVQRFSE VRNDLTGVLY GEDIEISDTE SFSNDPCTSV 240

KKLKGNDVRI ILGQFDQNMA AKVFCCAYEE NMYGSKYQWI IPGWYEPSWW EQVHTEANSS 300

RCLRKNLLAA MEGYIGVDFE PLSSKQIKTI SGKTPQQYER EYNNKRSGVG PSKFHGYAYD 360

GIWVIAKTLQ RAMETLHASS RHQRIQDFNY TDHTLGRIIL NAKNETNFFG VTGQVVFRNG 420

ERMGTIKFTQ FQDSREVKVG EYNAVADTLE IINDTIRFQG SEPPKDKTII LEQLRKISLP 480

LYSILSALTI LGMIMASAFL FFNIKNRNQK LIKMSSPYMN NLIILGGMLS YASIFLFGLD 540

GSFVSEKTFE TLCTVRTWIL TVGYTTAFGA MFAKTWRVHA IFKNVKMKKK IIKDQKLLVI 600

VGGMLLIDLC ILICWQAVDP LRRTVEKYSM EPDPAGRDIS IRPLLEHCEN THMTIWLGIV 660

YAYKGLLMLF GCFLAWETRN VSIPALNDSK YIGMSVYNVG IMCIIGAAVS FLTRDQPNVQ 720

FCIVALVIIF CSTITLCLVF VPKLITLRTN PDAATQNRRF QFTQNQKKED SKTSTSVTSV 780

NQASTSRLEG LQSENHRLRM KITELDKDLE EVTMQLQDTP EKTTYIKQNH YQELNDILNL 840

GNFTESTDGG KAILKNHLDQ NPQLQWNTTE PSRTCKDPIE DINSPEHIQR RLSLQLPILH 900
HAYLPSIGGV DASCVSPCVS PTASPRHRHV PPSFRVMVSG L 

Seq ID NO: 72 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005795 

Coding sequence: 522-1940 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GCACGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGGACCAT 60

CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGGATCA CATTGCAAAG CTTTCACTCT 120 

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT CAGAAAGTAA AGTTCCATCC 180 

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTG GGTCTTGACC CCTGGAATTT 240 

AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATGTGATT TGAGTCTGGA 300

GACAATTGTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360 

GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420 

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CAATTGGTCA CCACAACTTG 480

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540 

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGCCTTTTT 600 

TTATGATTCT TGTTACAGCA GAATTAGAAG AGAGTCCTGA GGACTCAATT CAGTTGGGAG 660

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720 

CCATTCAACA AGCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA 780 

ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGCCC TGATTACTTT CAGGACTTTG 840 

ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAG 900 

CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA 960 

AGACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGTCT ATTGCATCAC 1020 

TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA AGGATTACCT 1080 

TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACAATC ATTCACCTCA 1140 

CTGCAGTGGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC AAAGTGTCCC 1200 

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAGGCATTT 1260 

ACCTACACAC ACTCATTGTG GTGGCCGTGT TTGCAGAGAA GCAACATTTA ATGTGGTATT 1320

ATTTTCTTGG CTGGGGATTT CCACTGATTC CTGCTTGTAT ACATGCCATT GCTAGAAGCT 1380

TATATTACAA TGACAATTGG TGGATCAGTT CTGATACCCA TCTCCTCTAC ATTATCCATG 1440

GCCCAATTTG TGCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTACGCGTTC 1500

TCATCACCAA GTTAAAAGTT ACACACCAAG CGGAATCCAA TCTGTACATG AAAGCTGTGA 1560

GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGCGAC 1620

CTGAAGGAAA GATTGCAGAG GAGGTATATG ACTACATCAT GCACATCCTT ATGCACTTCC 1680

AGGGTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATGG AGAGGTTCAA GCAATTCTGA 1740

GAAGAAACTG GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC 1800

TTCGTAGTGC GTCTTACACA GTGTCAACAA TCAGTGATGG TCCAGGTTAT AGTCATGACT 1860

GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC 1920

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TGCTTCTCCT 1980

AACTCAAGGA CTTGGACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG 2040

GGGAATGTCA TAAAGAAGAG CCTTCACATG AAATTAGTAG TGTGTTGATA AGAGTGTAAC 2100

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC 2160

CACTATGCCT GATGTGACGC TACTAACCTG ACATCACCAA GTGTGGAATT GGAGAAAAGC 2220

ACAATCAACT TTTCTGAGCT GGTGTAAGCC AGTTCCAGCA CACCATTGAT GAATTCAAAC 2280

AAATGGCTGT AAAACTAAAC ATACATGTTG GGCATGATTC TACCCTTATT CSCCCCAAGA 2340

GACCTAGCTA AGGTCTATAA ACATGAAGGG AAAATTAGCT TTTAGTTTTA AAACTCTTTA 2400

TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCCA GAGTGCCGTA GTCCTTTTTG 2460

TAACTACCCT CTCAAATGGA CAATACCAGA AGTGAATTAT CCCTGCTGGC TTTCTTTTCT 2520

CTATGAAAAG CAACTGAGTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580

ATCTTGTGGC ATATCCATTG TGGAAACTGG ATGAACAGGA TGTATAATAT GCAATCTTAC 2640

TTCTATATCA TTAGGAAAAC ATCTTAGTTG ATGCTACAAA ACACCTTGTC AACCTCTTCC 2700

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG TCCCTTCCAT 2760

TTCTACTGTA TAAACAAATT AGCAATCATT TTATATAAAG AAAATCAATG AAGGATTTCT 2820

TATTTTCTTG GAATTTTGTA AAAAGAAATT GTGAAAAATG AGCTTGTAAA TACTCCATTA 2880

TTTTATTTTA TAGTCTCAAA TCAAATACAT ACAACCTATG TAATTTTTAA AGCAAATATA 2940

TAATGCAACA ATGTGTGTAT GTTAATATCT GATACTGTAT CTGGGCTGAT TTTTTAAATA 3000

AAATAGAGTC TGGAATGCTA TATTTGGTAA ATATTTTAAA GACAACCAGA TGCCAGCATC 3060

AGAAGTCTGT TTGAGAACTA AGAGAACAGA AACATCTATC ATAAGATATA TTTATTTTAA 3120

AAACACAAGG TCACTATTTT ACTGAATATA TTTGTTTTGA TAACTCATAC CTTAATAATA 3180

GGTGTGTTTG ACATATTTCT TTTTTCATTT TGACAATGAA CTCACATTCT AATCCAGAAA 3240

TTTTAAACAA CTACTGTGAT AAATACCAAT CTGCTACTTT TATAGATTTT ACCCCATTAA 3300

AATATTACTT TACTGACTTT TACTATGTGA AGATATATAG CTTTGGAAAT GTCCCAGGCT 3360

ATTCAAGAAA TATAAAAAAC TAGAAGGATA CTATATATAC CATATACAAT GCTTTAATAT 3420

TTTAATAGAG CTACTGTATA TAATACAAAT TAGGGAAATA CTTGAATATA TCATTGAGAA 3480

AAAATTATTG TCAGATCTTA CTGAATTATT GTCAGACTTT ATTAAATAAA GATAGAAGAA 3540

AACCTTGCTA ATGAATTAAA GTGAAATTTG CATGGGATTC AGTTTCTCTA ATGTTATTTT 3600

CCGCTGAAAT CTCTAAAGAA CAAGAATGAC TTCAATTAGT AAAAGTCAAT TTTGGGAAAA 3660

GTCATGGGTA TCTGTTTTTT AAGTGTGTCA ATCTGATTAA AATGGATGAA ACAAATTACT 3720

CATCATAAGT TGTTTCTTAA GCTGTCAATA TGTCAATAGA TGGTGAGTTC AGAACTTATT 3780

TCAAATTGCT AAGACAAATT ATCTAAATTC GTAAGAATTA ACATATAGAA TGGTCTGGTC 3840

AGTACATTTA TAATTTATCT ATGCATGAAA AAGTATTGTT TTGTTTGAAA CATGAATTTC 3900
ATAGCAAGCT GCCATAGAAA GGA

Seq ID NO: 73 Protein sequence:
Protein Accession #: NM_005795



1 11 21 31 41 51 

I I I I I I 

MLYSIFHLGL MMEKKCTLYF LVLLPFFMIL VTAELEESPE DSIQLGVTRN KIMTAQYECY 60 

QKIMQDPIQQ AEGVYCNRTW DGWLCWNDVA AGTESMQLCP DYFQDFDPSE KVTKICDQDG 120 

NWFRHPASNR TWTNYTQCNV NTHEKVKTAL NLFYLTIIGH GLSIASLLIS LGIFFYFKSL 180

SCQRITLHKN LFFSFVCNSV VTIIHLTAVA NNQALVATNP VSCKVSQFIH LYLMGCNYFW 240 

MLCEGIYLHT LIVVAVFAEK QHLMWYYFLG WGFPLIPACI HAIARSLYYN DNCWISSDTH 300

LLYIIHGPIC AALLVNLFFL LHIVRVLITK LKVTHQAESN LYMKAVRATL ILVPLLGIEF 360 

VLIPWRPEGK IAEEVYDYIM HILMHFQGLL VSTIFCFFNG EVQAILRRNW NQYKIQFGNS 420 
FSKSEALRSA SYTVSTISDG PGYSHDCPSE HLNGKSIHDI ENVLLKPENL YN 

Seq ID NO: 74 Nucleotide sequence: 
Nucleic Acid Accession #: NM_000450.1 

Coding sequence: 117..1949 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CCTGAGACAG AGGCAGCAGT GATACCCACC TGAGAGATCC TGTGTTTGAA CAACTGCTTC 60 

CCAAAACGGA AAGTATTTCA AGCCTAAACC TTTGGGTGAA AAGAACTCTT GAAGTCATGA 120

TTGCTTCACA GTTTCTCTCA GCTCTCACTT TGGTGCTTCT CATTAAAGAG AGTGGAGCCT 180 

GGTCTTACAA CACCTCCACG GAAGCTATGA CTTATGATGA GGCCAGTGCT TATTGTCAGC 240 

AAAGGTACAC ACACCTGGTT GCAATTCAAA ACAAAGAAGA GATTGAGTAC CTAAACTCCA 300 

TATTGAGCTA TTCACCAAGT TATTACTGGA TTGGAATCAG AAAAGTCAAC AATGTGTGGG 360 

TCTGGGTAGG AACCCAGAAA CCTCTGACAG AAGAAGCCAA GAACTGGGCT CCAGGTGAAC 420 

CCAACAATAG GCAAAAAGAT GAGGACTGCG TGGAGATCTA CATCAAGAGA GAAAAAGATG 480 

TGGGCATGTG GAATGATGAG AGGTGCAGCA AGAAGAAGCT TGCCCTATGC TACACAGCTG 540 

CCTGTACCAA TACATCCTGC AGTGGCCACG GTGAATGTGT AGAGACCATC AATAATTACA 600
CTTGCAAGTG
CCCTGGAATC 
ACAATTCTTC 



GTGATGCTGT 
TCCCATGGAA 
AGAGCCTTCA 
TGACATGCAG 
CTGGAGAGTT 



TGACCCTGGC 
CCCTGAGCAT 
CTGCTCTATC 
GTCCTCTGGA 
GACAAATCCA 
CACAACCTGT 
GTGTACCTCA 
GGCCGTCCGC 



GTGAAGCTTT 
CTAGTGCTTC 
TTGTGTTGAA 



GGTGTGCTCA 
AGGAGGGATT 
CAGAAGAGGT 
TCAACATGAG 
AAGGATGGAC 
GCCTGCTACC 
CTGCTGCTGG 
TACGGAAAGC 
GCTACCAAAA 
GGAACTAGAG 
CTTCTTGCCT 
AATAGATCAA 
TCCTACTCTC 
CAAAGGTGAA 
CGCTGTAAAA 
CAGTGTTTCG 
GTTTAGAGGA 
CCCACAGGCA 
AGGGTTACTA 
GTTCTTAAGG 
TCATTCAATA 
GGTAAAAGTT 
TGTTTTAAAT 
AGAATAAAGT 
AGTAAAAACT 
CCACGATGAA 
CACACAAAGG 
GCTTTGCATT 
TATGGAAGAT 
TTTAACGAAT 
GCTCTGGAAG 
AACAATTCCA 
AGTAATTGCC 
CCATTAACTT 
AACGACAAAG 
TTAAAGGGGC 
ATGGAATACA 
GCATTAGAAA 
TTTAAATTAT 
TCAGACCTAT 



CCAGGTTGAA 
CCAGTGCACA 
TGGCAGTTTC 
GGGATCCAAA 
TGAAGCTGTG 
TTCCCCTATT 
TGAATTATAT 
TCCTTCCTGC 
CTGCAGTGGG 
GCTCAATGGC 
TACCTGTGAA 
ACTCTCCCTC 
AAAGAAATTT 
GCCTTCTTAC 
GGATACACTG 
ACTATGCCAG 
AGTCCAGCAG 
AGGATCAAGA 
GAGACCAAGA 
TCTTGGCACA 
ACAGCTGATT 
AAAAAATGAC 



TTCAGTGGAC 
GGAAGCCTGG 
AGCTGTGATA 
GAATGGAGTG 
GCCAATGGGT 
ACATTTGACT 
TCTGGGAATT 
CAGCCTCAGA 
TCATCCTGCA 
TGCACCACTC 
GCCTTGTCCA 



TCAAGTGTGA 
TTTGCAGTCA 
GGGGTTACCT 
CTCCTATTCC 
TCGTGGAATG 
GTGAAGAAGG 
GGGACAACGA 
ATGGCTCTGT 
ACTTCACCTG 



AAGTGTACAG 
AACTTCAGCT 
ATGGAGACCA 
GTGGTTGAGT 



ATGGGAGCCC 
TGTAAAGCTG 
CATTCCCCTG 



AGGCTCCAAT 
AGATGCGATG 
GGAGAATTCA 
GGATCAACTC 
CAAGTGGTAA 
GAGCCCGTGT 
TCTGCAGCTC 
GCTCCCACTG 
CTGACATTAG 



ATCCTTTAAG 
AAGTTAACAG 
ATGCCTTTAT 
GCAAGGACGG 
AAGTGTTGGC 
CTCTGAAATC 



ACCCCGAGCG 
CCAGCTGTGA 
GTGGCCCCAC 
CTGTCCACCA 
CCTACAAGTC 
AACTTGAGTG 
AATGTTCAAG 
TTGGCACTGT 
GGACATGTGG 
AGTCCAACAT 
CACCATTTCT 
GCAGCTGCCA 
TTCAAAAGAA 
AGACAGATAA 



CTCTTGTGCC 
CACATCTCAG 
CCTGGCAGTT 
GTGCAAGTTC 
AGCCACAGGA 
TCCCTTGGTA 
CCTCTGGCTT 



ATCCCAGTTT 
AATTGTCTTC 
GAGCAGGGTT 
GACAACGAGA 
GGTTTGGTGA 
TTCAGCTGTG 
GGACAATGGA 
CCGGGAAAGA 
GCCTGTCCTG 
CACTGGTCTG 



CGGAAATGCT 
TCAGACGGAA 
GTGCATCTGG 



TGCACAATTT 



CAAGTGTGGT 
ACTTATTCTA 
TTATTTCAAA 

TCTGAGTGTT 
GAATGGAAGG 
AAACTTCCAT 
TATAATTTTA 
CCTACAAAGA 
TTTAAATTCA 
GAAGATGTCT 
AGAGGAATGC 
AAGGAATCTC 
AAAGCTGCTC 
AGCATGTGTT 
CCAACAGTCA 
AGAAAAACTC 
GTGTTATTTT 
TTAGCTGTGT 
AACTTAAAAT 
TTGACATAAC 



ACACAGTTGC 
TAAAAATATT 
GGGTTGTTAA 
AATCACTTTC 
TTTTACTTGC 
AGGGACTTAA 
GATTACCCCC 



CCTTCAACTG 
TAATGAAGGG 
TCAGAATTCC 
ATTTTGTGGC 
TGTCATAAGA 
ATAACTTAAA 
TGGTGCAAAT 



TTTACTACAG 
TTTGTATATT 
GAGGCCAAAC 



TGTTTGTCAG 
CAATAGAAAC 
AATAGTTATT 
CTGTGTGAGC 
CAGTTTTCAG 
TAGCCTTGAG 



ATTGAATATA 
AAAACTTGTA 
TCATTGTTTA 
ATTGTCCCCT 
TTGTTTTTTG 
GTCAGATATT 
GTTTTGAACT 
TGTTGGAAAA 
ATGTGATATG 
TCACCATGTA 
CCCTATTTGT 
AAGCATTTAT 
TTGATCACTG 
GAGTGTGAGA 
GTTTCAGAGA 



AAAAGACTCA 
AAAGGATATT 
TTTTCTAACT 
TTTCTTTCTT 
ATGAATAATA 
AAAATGACAG 
CCTACTGAAT 
GATTCAGTGC 
TTATAATCTT 
AATGCTGTCA 



TCACCACTTC 
GTGTTCCCTT 
TTCTTCCAAG 
CTCCCTTGCT 
TTGCCCTTCA 
ATTATCCAGA 
ATGTTGAATG 
GCTCTGTGCG 
TTCTTAAAGA 
CCATACTTCT 
ACTATGATAT 



1200
1260
1320
1380
1440
1500
1560
1620
1680
1740

1800

1860
1920



2100 
2160 
2220 
2280 
2340 

2460 
2520 
2580 
2640 
2700 



AGCAAGGCAT 



AAACAGAGAT 
TGGGAAATAA 
CTTTGAAATT 



TTTTCAGAAA 
AATAAAAGCA 
GAATACAGAA 
TAAACATAAT 
AAAGAGTCAT 
TTTCTTCTGT 

GCAATGAAAA 
ATCAAAACTC 
AGTTCTGGCT 
TCAGAACAGC 



GATGTTAACC 
AGAATTGGAG 
TATGTGGTTT 



AGATGGATGT 
TCTTGTATAT 
CTGGTAGATT 
ATGTTAGGGT 
AAGCAGATTT 
ATTCTCAGTC 
TCCTACACTT 
GAACACTGGC 
AGAGGTTCTT 



TGTAAATATT 
GTTTGAGTTT 
TATATTTATT
ATGTGCTTAT 



TATGTAAACT 
TATTGAGAAT 
TAAGCTTATG 



Seq ID NO: 75 Protein sequence:
Protein Accession # 



I 

MIASQFLSAL 
SILSYSPSYY 
DVGMWNDERC 
TALESPEHGS 
ECDAVTNPAN 
AVTCRAVRQP 
VCEAFQCTAL 
EKPTCEAVRC 
WTEEVPSCQV 
SGLLPTCEAP
GSYQKPSYIL 



11 



I 

TLVLLIKESG
WIGIRKVNNV 
SKKKLALCYT 
LVCSHPLGNF 
GFVECFQNPG 
QNGSVRCSHS 



I 

AWSYNTSTEA 
WVWVGTQKPL
AACTNTSCSG 
SYNSSCSISC 
SFPWNTTCTF 
PAGEFTFKSS 



31 
I 

MTYDEASAYC 
TEEAKNWAPG 
HGECVETINN 
DRGYLPSSME 
DCEEGFELMG 



41 



I 

QQRYTHLVAI 
EPNNRQKDED 
YTCKCDPGFS 
TMQCMSSGEW 
AQSLQCTSSG 
LQGPAQVECT 



DAVHQPPKGL 
VKCSSLAVPG 
TESNIPLVAG 



VRCAHSPIGE 



LSAAGLSLLT 



FTYKSSCAFS 
VFGTVCKFAC 
LAPFLMJLRK 



3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600
3660 
3720 
3780 



51 

I 

QNKEEIEYLN 
CVEIYIKREK 
GLKCEQIVNC 
SAPIPACNVV
NWDNEKPTCK 
TQGQWTQQIP 
QCGPTGEWDN 
TQLECTSQGQ 
ARTCGATGHW 
ASSCQSLESD 



Seq ID NO: 76 Nucleotide sequence:



; and stop codons) 






1 11 21 31 41 51 

I I I I I I 

CCCGACCCGT GCGAGGGCCA GGTCCGCGCC TGCCCCGCCA GGCGAAGCGA GGCGACCCGC 60

GTGCGGCCAT GGCTTCGCTG CTGGGAGCCT ACCCTTGGCC CGAGGGTCTC GAGTGCCCGG 120

CCCTGGACGC CGAGCTGTCG GATGGACAAT CGCCGCCGGC CGTCCCCCGG CCCCCGGGGG 180

ACAAGGGCTC CGAGAGCCGT ATCCGGCGGC CCATGAACGC CTTCATGGTT TGGGCCAAGG 240

ACGAGAGGAA ACGGCTGGCA GTGCAGAACC CGGACCTGCA CAACGCCGAG CTCAGCAAGA 300

TGCTGGGAAA GTCGTGGAAG GCGCTGACGC TGTCCCAGAA GAGGCCGTAC GTGGACGAGG 360

CGGAGCGGCT GCGCCTGCAG CACATGCAGG ACTACCCCAA CTACAAGTAC CGGCCGCGCA 420

GGAAGAAGCA GGCCAAGCGG CTGTGCAAGC GCGTGGACCC GGGCTTCCTT CTGAGCTCCC 480

TCTCCCGGGA CCAGAACGCC CTGCCGGAGA AGAGAAGCGG CAGCCGGGGG GCGCTGGGGG 540

AGAAGGAGGA CAGGGGTGAG TACTCCCCCG GCACTGCCCT GCCCAGCCTC CGGGGCTGCT 600

ACCACGAGGG GCCGGCTGGT GGTGGCGGCG GCGGCACCCC GAGCAGTGTG GACACGTACC 660

CGTACGGGCT GCCCACACCT CCTGAAATGT CTCCCCTGGA CGTGCTGGAG CCGGAGCAGA 720

CCTTCTTCTC CTCCCCCTGC CAGGAGGAGC ATGGCCATCC CCGCCGCATC CCCCACCTGC 780

CAGGGCACCC GTACTCACCG GAGTACGCCC CAAGCCCTCT CCACTGTAGC CACCCCCTGG 840

GCTCCCTGGC CCTTGGCCAG TCCCCCGGCG TCTCCATGAT GTCCCCTGTA CCCGGCTGTC 900

CCCCATCTCC TGCCTATTAC TCCCCGGCCA CCTACCACCC ACTCCACTCC AACCTCCAAG 960

CCCACCTGGG CCAGCTTTCC CCGCCTCCTG AGCACCCTGG CTTCGACGCC CTGGATCAAC 1020

TGAGCCAGGT GGAACTCCTG GGGGACATGG ATCGCAATGA ATTCGACCAG TATTTGAACA 1080

CTCCTGGCCA CCCAGACTCC GCCACAGGGG CCATGGCCCT CAGTGGGCAT GTTCCGGTCT 1140

CCCAGGTGAC ACCAACGGGT CCCACAGAGA CCAGCCTCAT CTCCGTCCTG GCTGATGCCA 1200

CGGCCACGTA CTACAACAGC TACAGTGTGT CATAGAGCTG GAGGCGCCCC GTCCGGTCAG 1260

CCCTCGCGCC CTCTCCTTCT TGTGCCTTGA GTGGCAGAGG AGCCGTCCAG CCACACCAGC 1320

TTTCCTCCCA CCGCTCAGGG CAGGGAGGTC TGAACTGCGG CCCCAGAGCC TTTGGCCTAA 1380

GCTGGACTCT CCTTATCCGA GTGCCGCCTC TATCCCCTTC CCCACGTTCC AGCCCCTGCA 1440

GCCCACATTT TAAGTATATT CCTTCAAGTG AGTTTTCCTC CAGCCCCTGA GAGTTGCTGT 1500

CTCCCAGTGG AATGTTCACT GACGTCTTTT CTTGGTAGCC ATCATCGAAA CTAATGGGGG 1560

GACAGACTTG ATAGCCAAGG TCCCTTCTGG TCCAGTTTTC TGATTTAGGG TTCTCTCAAG 1620

ATTAATAAAG GAAGATGGGG AAATTTGACT CATTAATGAG CTCGCTAACC TACGATCTGG 1680

TGATAATTTT GTGTGCACAG CCCAAGGACC ACGAGGCTTT CTGCACTTTC TGCACCCCCT 1740

TCCAAAGTGA CCACAAAATT TCAAAGGGAC TCATACAATT TGAGAAAAAA CAGTCAACCT 1800

GATTTGAGAA ATTA^CCAGT ATGGCTAACT ATATCACAGA AAATGGGATT GAGTTAAAAC 1860

TATTTTATTT TAAATATACA TTTTAAAGCA GTTCTTTTTT TTTGTTAATT TGTTTATTAT 1920

ACACACACTT CAAGAGCCAC CGCGCCCAGC CTACATTTAT AAT7TTCATT CTC^TTTACC 1980

TATAAAATTC AGTGTATTAG TTTCATTACA TAGGAGAAAT TATATTTCTA AACATTTTAT 2040
GATGTTTAAA AACAAAACAG GCTGTTGTAA AAAAAAAAAA AAAAAAAAA 

Seq ID NO: 77 Protein sequence:



1 11 21 31 41 51 

I I I I I I

MASLLGAYPW PEGLECPALD AELSDGQSPP AVPRPPGDKG SESRIRRPMN AFMVWAKDER 60

KRLAVQNPDL HNAELSKMLG KSWKALTLSQ KRPYVDEAER LRLQHMQDYP NYKYRPRRKK 120

QAKRLCKRVD PGFLLSSLSR DQNALPEKRS GSRGALGEKE DRGEYSPGTA LPSLRGCYHE 180

GPAGGGGGGT PSSVDTYPYG LPTPPEMSPL DVLEPEQTFF SSPCQEEHGH PRRIPHLPGH 240

PYSPEYAPSP LHCSHPLGSL ALGQSPGVSM MSPVPGCPPS PAYYSPATYH PLHSNLQAHL 300

GQLSPPPEHP GFDALDQLSQ VELLGDMDRN EFDQYLNTPG HPDSATGAMA LSGHVPVSQV 360
TPTGPTETSL ISVLADATAT YYNSYSVS 

Seq ID NO: 78 Nucleotide sequence: 
Nucleic Acid Accession #: XM_035787 

Coding sequence: 329.. 949 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

TGCCCCGCCC CGCTCCCCAG CGCCCCGGAA GTGATCTGTG GCGGCTGCTG CAGAGCCGCC 60

AGGAGGAGGG TGGATCTCCC CAGAGCAAAG CGTCGGAGTC CTCCTCCTCC TTCTCCTCCT 120

CCTCCTCCTC CTCCTCCAGC CGCCCAGGCT CCCCCGCCAC CCGTCAGACT CCTCCTTCGA 180

CCGCTCCCGG CGCGGGGCCT TCCAGGCGAC AAGGACCGAG TACCCTCCGG CCGGAGCCAC 240

GCAGCCGCGG CTTCCGGAGC CCTCGGGGCG GCGGACTGGC TCGCGGTGCA GATTCTTCTT 300

AATCCTTTGG TGAAAACTGA GACACAAAAT GGCTGCAAAT AAGCCCAAGG GTCAGAATTC 360

TTTGGCTTTA CACAAAGTCA TCATGGTGGG CAGTGGTGGC GTGGGCAAGT CAGCTCTGAC 420

TCTACAGTTC ATGTACGATG AGTTTGTGGA GGACTATGAG CCTACCAAAG CAGACAGCTA 480

TCGGAAGAAG GTAGTGCTAG ATGGGGAGGA AGTCCAGATC GATATCTTAG ATACAGCTGG 540

GCAGGAGGAC TACGCTGCAA TTAGAGACAA CTACTTCCGA AGTGGGGAGG GGTTCCTCTG 600

TGTTTTCTCT ATTACAGAAA TGGAATCCTT TGCAGCTACA GCTGACTTCA GGGAGCAGAT 660

TTTAAGAGTA AAAGAAGATG AGAATGTTCC ATTTCTACTG GTTGGTAACA AATCAGATTT 720 

AGAAGATAAA AGACAGGTTT CTGTAGAAGA GGCAAAAAAC AGAGCTGAGC AGTGGAATGT 780 

TAACTACGTG GAAACATCTG CTAAAACACG AGCTAATGTT GACAAGGTAT TTTTTGATTT 840

AATGAGAGAA ATTCGAGCGA GAAAGATGGA AGACAGCAAA GAAAAGAATG GAAAAAAGAA 900

GAGGAAAAGT TTAGCCAAGA GAATCAGAGA AAGATGCTGC ATTTTATAAT CAAAGCCCAA 960

ACTCCTTTCT TATCTTGACC ATACTAATAA ATATAATTTA TAAGCATTGC CATTGAAGGC 1020

TTAATTGACT GAAATTACTT TAACATTTTG GAAATTGTTG TATATCACTA AAAGCATGAA 1080

TTGGAACTGC AATGAAAGTC AAATTTACTT TAAAAAGAAA TTAATATGGC TTCACCAAGA 1140

AGCAAAGTTC AACTTATTTC ATAATTGCCT ACATTTATCA TGGTCCTGAA TGTAGCGTGT 1200

AAGCTTGTGT TTCTTGGGCA GTCTTTCTTG AAATTGAAGA GGTGAAAIGG GGGTGGGGAG 1260

TGGGAGGAAA GGTGACTTCC TCTGGTGTTT ATTATAAAGC TTAAATTTTA TATCATTTTA 1320

AAATGTCTTG GTCTTCTACT GCCTTGAAAA ATGACAATTG TGAACATGAT AGTTAAACTA 1380

CCACTTTTTT TAACCATTAT TATGCAAAAT TTAGAAGAAA AGTTATTGGC ATGGTTGTTG 1440

CATATAGTTA AACTGAGAGT AATTCATCTG TGAATCTGCT TTAATTACCT GGTGAGTAAC 1500

TTAGAAAAGT GGTGTAAACT TGTACATGGA ATTTTTTGAA TATGCCTTAA TTTAGAAACT 1560

GAAAAATATC TGGTTATATC ATTCTGGGTG TGTTCTTACT GACACCAGGG GTCCGCTGCC 1620

CCATGTGTCC TGGTGAGAAA ATATATGCCT GGCACAGCTT TTGTATAGAA AATTCTTGAG 1680

AAGTAACTGT CCGCTAGAAG TCTGTCCAAA TTTAAAATGT GTGCCATATT CTGGTTCTTG 1740

AAAATAAGAT TCCAGAGCTC TTTGATCGCT TTTAATAAAC TGCAAGTTCA TTTTAAATGA 1800

AGGGCCAGCA TATATACTTG CAAGATAATT TTCAGCTGCA AGGATTCAGC ACCAGTTATG 1860

TTTGAATGAA CCCTCCTTTT CTCTGAGATT CTGGTCCCTG GAAATCCCTT TCTGCTAGTG 1920

GTGAGCATGT AAGTGTTAAG TTTTTAATCT GGGAGCAGGG CATAGGAAGA AAATGTCAGT 1980

AGTGCTAATG CATTTTGCAG TAGAACGCTT CGGGAAAATA TTCATGCTTG CCATCTGTTC 2040

ATTTCTAAAT TTATATTCAT AAAGTTACAG TTTGATACAG GAATTATTAG GAGTAATTCT 2100

TTTCTGTTTC TGTTTATAAT GAAGAACACT GTAGCTACAT TTTCAGAAGT TAACATCAAG 2160

CCATCAAACC TGGGTATAGT GCAGAAAACG TGGCACACAC TGACCACACA TTAGGCTGTG 2220

TCACCATTGT GTGGTGTACC TGCTGGAAGA ATTCTAGCAT GCTACTTGGG GACATAATTT 2280

CAGTGGGAAA TATGCCACTG ACCGATTTTT TTTTTTTCCT CTTTGCAGTG GGGCTAGGAC 2340

AGTTGATTCA ACAAAGTATT TTTTTCTTTT TTCTCAGTCC TAATTTGAAC AGGTCAAAGA 2400

TGTGTTCAGG CATTCCAGGT AACAGGTGTG TATGTAAAGT TAAAAATAGG CTTTTTAGGA 2460

ACTCACTCTT TAGATATTTA CATCCAGCTT CTCATGTTAA ATATTTGTCC TTAAAGGGTT 2520

TGAGATGTAC ATCTTTCATT TCGTATTTCT CATAGGCTAT GCCATGTGCG GAATTCAAGT 2580

TACCAATGTA ACACTGGCCA GCGGGCCCAG CAATCTCCAT GTGTACTTAT TACAGTCTTA 2640

TTTAACCAGG GGTCCTAACC ACTAACATTG TGACTTTGCT TTGAGACCTT TCCTCTCCTG 2700

GGTACTGAGG TGCTATGAAG CCAACTGACA AAGATGCATC ACGTGTCTTA GGCTGATGCC 2760
ACTACCCGAT TTGTTTATTT GCAATTTGAG CCATTTAAAG ACCAATAAAC TTCCTTTTTT

Seq ID NO: 79 Protein sequence:
Protein Accession #: XP_035787

1 11 21 31 41 51

I I I I I I

MAANKPKGQN SLALHKVIMV GSGGVGKSAL TLQFMYDEFV EDYEPTKADS YRKKVVLDGE 60

EVQIDILDTA GQEDYAAIRD NYFRSGEGFL CVFSITEMES FAATADFREQ ILRVKEDENV 120

PFLLVGNKSD LEDKRQVSVE EAKNRAEQWN VNYVETSAKT RANVDKVFFD LMREIRARKM 180
EDSKEKNGKK KRKSLAKRIR ERCCIL

Seq ID NO: 80 Nucleotide sequence:
Nucleic Acid Accession #: NM_003467
Coding sequence: 89..1147 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GTTTGTTGGC TGCGGCAGCA GGTAGCAAAG TGACGCCGAG GGCCTGAGTG CTCCAGTAGC 60

CACCGCATCT GGAGAACCAG CGGTTACCAT GGAGGGGATC AGTATATACA CTTCAGATAA 120

CTACACCGAG GAAATGGGCT CAGGGGACTA TGACTCCATG AAGGAACCCT GTTTCCGTGA 180

AGAAAATGCT AATTTCAATA AAATCTTCCT GCCCACCATC TACTCCATCA TCTTCTTAAC 240

TGGCATTGTG GGCAATGGAT TGGTCATCCT GGTCATGGGT TACCAGAAGA AACTGAGAAG 300

CATGACGGAC AAGTACAGGC TGCACCTGTC AGTGGCCGAC CTCCTCTTTG TCATCACGCT 360

TCCCTTCTGG GCAGTTGATG CCGTGGCAAA CTGGTACTTT GGGAACTTCC TATGCAAGGC 420

AGTCCATGTC ATCTACACAG TCAACCTCTA CAGCAGTGTC CTCATCCTGG CCTTCATCAG 480

TCTGGACCGC TACCTGGCCA TCGTCCACGC CACCAACAGT CAGAGGCCAA GGAAGCTGTT 540

GGCTGAAAAG GTGGTCTATG TTGGCGTCTG GATCCCTGCC CTCCTGCTGA CTATTCCCGA 600

CTTCATCTTT GCCAACGTCA GTGAGGCAGA TGACAGATAT ATCTGTGACC GCTTCTACCC 660

CAATGACTTG TGGGTGGTTG TGTTCCAGTT TCAGCACATC ATGGTTGGCC TTATCCTGCC 720

TGGTATTGTC ATCCTGTCCT GCTATTGCAT TATCATCTCC AAGCTGTCAC ACTCCAAGGG 780

CCACCAGAAG CGCAAGGCCC TCAAGACCAC AGTCATCCTC ATCCTGGCTT TCTTCGCCTG 840

TTGGCTGCCT TACTACATTG GGATCAGCAT CGACTCCTTC ATCCTCCTGG AAATCATCAA 900

GCAAGGGTGT GAGTTTGAGA ACACTGTGCA CAAGTGGATT TCCATCACCG AGGCCCTAGC 960

TTTCTTCCAC TGTTGTCTGA ACCCCATCCT CTATGCTTTC CTTGGAGCCA AATTTAAAAC 1020

CTCTGCCCAG CACGCACTCA CCTCTGTGAG CAGAGGGTCC AGCCTCAAGA TCCTCTCCAA 1080

AGGAAAGCGA GGTGGACATT CATCTGTTTC CACTGAGTCT GAGTCTTCAA GTTTTCACTC 1140

CAGCTAACAC AGATGTAAAA GACTTTTTTT TATACGATAA ATAACTTTTT TTTAAGTTAC 1200

ACATTTTTCA GATATAAAAG ACTGACCAAT ATTGTACAGT TTTTATTGCT TGTTGGATTT 1260

TTGTCTTGTG TTTCTTTAGT TTTTGTGAAG TTTAATTGAC TTATTTATAT AAATTTTTTT 1320

TGTTTCATAT TGATGTGTGT CTAGGCAGGA CCTGTGGCCA AGTTCTTAGT TGCTGTATGT 1380

CTCGTGGTAG GACTGTAGAA AAGGGAACTG AACATTCCAG AGCGTGTAGT GAATCACGTA 1440

AAGCTAGAAA TGATCCCCAG CTGTTTATGC ATAGATAATC TCTCCATTCC CGTGGAACGT 1500

TTTTCCTGTT CTTAAGACGT GATTTTGCTG TAGAAGATGG CACTTATAAC CAAAGCCCAA 1560

AGTGGTATAG AAATGCTGGT TTTTCAGTTT TCAGGAGTGG GTTGATTTCA GCACCTACAG 1620
TGTACAGTCT TGTATTAAGT TGTTAATAAA AGTACATGTT AAACTTACTT AGTGTTATG



Seq ID NO: 81 Protein sequence:
Protein Accession #: NP_003458

1 11 21 31 41 51

I I I I I I

MEGISIYTSD NYTEEMGSGD YDSMKEPCFR EENANFNKIF LPTIYSIIFL TGIVGNGLVI 60
LVMGYQKKLR SMTDKYRLHL SVADLLFVIT LPFWAVDAVA NWYFGNFLCK AVHVIYTVNL 120
YSSVLILAFI SLDRYLAIVH ATNSQRPRKL LAEKVVYVGV WIPALLLTIP DFIFANVSEA 180
DDRYICDRFY PNDLWVVVFQ FQHIMVGLIL PGIVILSCYC IIISKLSHSK GHQKRKALKT 240
TVILILAFFA CWLPYYIGIS IDSFILLEII KQGCEFENTV HKWISITEAL AFFHCCLNPI 300
LYAFLGAKFK TSAQHALTSV SRGSSLKILS KGKRGGHSSV STESESSSFH SS



Seq ID NO: 82 Nucleotide sequence:

Nucleic Acid Accession #: NM_014959 
Coding sequence: 314.. 1609 (underlined sequences correspond to start and stop codons) 



ATTTGACAAA TGGAAAAAAA 
CCGAGACGGG TATACAGGGA 
TTTTTCCAAG AAGATGATGA 



AAATAATGTT 
TTCCGGTCGC 
TTCAATGGTA 
GGAGTGTCCA 
GCTACCCTGT 



CATAGAGAAG 
CATGATGGCT 
AAAAAGATAC 
GAAAAGAGTA 
GTTTCTGAGA 



TTCCTCTTAT 
TCTGGGGCCT 
CGTTTGGTTC 
AAGGGATGAG 
CCTGCAGCAC 
AGAGGAGGCT 



GCTTCTAAAG 



CCGGGTGGAG 
GCTGCGGATC 
TTATCACCCC 
GCTAACAAAG 
GCCCCCAATG 
GAAAGTAATG 
CTCAAAATTC 
ACATGGGACT 
ATCAGCCCCT 
AGCCAGGATG 
TGAGAATGAG 



CCCACTGCTG 
GTCACAGTGA 
CATGAACAGT 
GTCGCCGAAA 
TTTCTCGTTG 
CCTTTCTATG 
GCCAGTGGGA 
CACCCCGAAG 
GCGATAGATG 
GAACCCCTGA 
CCCAAGGAGT 
TATGCTGGGC 
TTGGTGTGGG 
CCTCCTTTCT 



TGAAAGGGAC 
TAGGTAGTCT 

TCATTGATTT 



AATGTCTGAA 
CACTATTTTA 
GTGTTTATGA 
CCTATGGACA 
AAGCTTCTCA 
GTTAGGACTT 
TGAAAGAGAG 
GTGTGTAGCC 
CTGAGATGAA 
CAGCTTATTC 
ACATTTGGAT 
ATGTTTTCGT 
TAGTAGACAC 
AGTCTCTTTT 
ATTTATCCTC 
CCACTTAAAG 
TAGGATATAG 



GTGGAGAAGA 
CCTTACCTCG 
GGAAGAGAGA 
CAGTGTTCAA 
TGTAACCTGG 
GAAGGTAGTA 
TCCATTGACA 
AGGATGGGGC 
CTCCGTTTGA 
GCTGAGGACA 



TGCTACAAGG 
ATGTGGTAAA 
TGAGACAGAC 
TCrACCATAG 
GGCACACATA 
CATCTGGTTG 
AACTGGAAGA 



ATATATTTTC 
TTATGTTTTT 
GTTCTGCAAA 



TACTTTTAAT 
CCAGATATTT 
TTTCCACTTC 
TTCTCTTATT 
TTTTTGCTTG 



CATTCCCAGG 
TCTGTTTTGA 
TGGATGTTGA 
GCTGGTATCT 
CGATTGCGTT 
GGCTGGTGGG 
TCCACCTCCC 
CCCATTTTAA 
CTGTCCTC-GA 
CTCGCCTCTC 
ATATTAAGTT 
ATGAGGAAGA 
ACTTTGGTTC 
TGAAATTGTC 
AGATGAAGGA 
ATACTGAGGT 
CAGGTGCAGC 
AAGGGGTGCT 
TGGAGCAGGA 
AAGGGGACCT 
TGTCCTATCT 
ATCCAGCGTT 
GACAGAAGAA 
CATGTACCTA 
ATATTCCTTT 
TGATTCTTGA 
CTGGAAAGGC 
AGTATCACCT 
CTCAAGGCAT 
TCTATGGCTA 
GGAGAAAATG 
AGTTGTTGGG 
TCAACTCCAC 
ATTCCTGGCA 
ACTCTGTCAT 
TTTCCATCCT 
AGTCAGTTTT 
TTTCAATTAC 
TTTGTTATGT 
GCACTTTATC 
CATAATCTTT 
CACTCAGAAG 
ACCATTTCTG 
TTCTGATAGA 
TTTTGCTTCC 
TAGTATTTTT 



AGACATTTGC 
GATCGAAGAA 
GTTGATTGAT 
GTGGTCAGCC 



AGTAAAGACA 
GCAGCAGTGA 
CCCTTTGTGA 
TGTTCCGTGC 
TCAGAAGAGA 
GATTATAAAA 
AAGAGCACAA 
ACAGGCCTCG 



CGGCCCCTTG 
CCACTTCATC 
GAATGAAGGG 
AAGCCCCAGC 
CATCCCCATC 



iTGTCA 



AACGCGGTTA 
TAAATACCAC 
GGAAGAGCTG 
CATCTCACAT 
TGTTCCTGAG 
ATCAAATAGT 
ATCGTCAGTT 
ACAGATACAG 
GCTTCCTGGT 
TGGCCCTGGA 
CTGCAGAGCC 
GTGAGGTGGA 
AGCATCCAGC 
TGGGCATCCT 



TCGCTTCCAT 
CAGTTATATT 
CTACAGGAGC 
ACCCATTCAA 
GAAGCCAGTG 
CTTTGTGAAG 



GGCCCTGGAC 
TAGACAGCAG 
CTCATTGGAA 
GACTGGGTAA 
TTGACTGTAT 
TAAATTTTTT 
AGACCCAGGA 
AACTTTTCCT 
TCTCATAACT 
ACATGATGAC 



CTTGTCCCCA 
GGTGTGCGCC 
GTGTCTAATT 
CCTGGAGAAA 
CTTGAGATTA 
GATCTCCAGC 
GAGAACCACC 
CAGGACAATG 
CAGAGCAAGA 
GTGCTCTTCA 
AATTTGTAAA 
ATGGATAAAC 



GCGACGCCTT 
TGCAGACTTC 
CTGCTAACCT 
TTCAGCACTT 
CTGAAAAAAG 



AAAAGCAGAA 



CAAACATGAG 
TTAAATGTTC 
AGAACCACCA 
ATGTACCATA 
TTTGTAGCCA 
TTTATGTTTA 
TTTTATGGTG 
TTACGTTAAT 
CTTTTCTATT 
ACCATTTAGA 
CTGTATCTTA 
ATTTAAAAAA 
TTGCTCTTCC 
CGTTTTTTAG 
TCTGAGGACA 
AGTTGACATT 



AACAATGTAA 
CAGAAAATAT 
ATGGTCAACT 
AAAAGAAAAT 
CAAAAAATAA 
TTTCAGCTGT 
ATCTGTTTAA 
TATTTTGAAC 
GTAATTATTA 
CTACTGTTAT 
TCCTATTACC 
AAACCCATCA 
CTCATGAGAA 
TTCATTATTT 
TTCTTTTAGA 



GTTTTCTGTT 



GGCAACTCCA 
AGGTTCTTAC 
ATGAGGCCTT 
GAAGCATTAG 
ATGAGTCAGT 
AGAAATGTGA 
ACAGGCTTTC 
TTTCCTCAAG 
CTTGATATAT 
CCGGATAGGT 
AAAAATAATT 
AAGCTAACAA 
TTTTTTGTAT 
ATGTATTTGC 
TTTCCCACTG 
CCAAGGAATA 
GAGGGTAATT 
GCCAACTCTG 
CTTTTGATTA 
AACAAGTTCC 
CCATTCTGAT 
ATATGTTAGG 
CACATTTGTA 
TTTTCTCATC 
ATAAATTATT 
GAGTAATCTG 
TCCAAATTTC 
GTGGTTCTGA 
ACCTTCATTC 
CAGCAGTTTC 



1320 
1380 
1440 
1500 
1560 
1620
1680 



2040 
2100 
2160 

2220 



2460
2520 
2580 
2640 
2700 
2760 



2940 
3000 
3060 
3120
CTTTTAGCTT CCGTATTTCC TGATGAGAAA TCTGCAGTCA ITCAAATTGT 3180

TATGTAGTGT GTCATTTTTC TGTCAGATTT CAAGGTATTT ATCTTTAGTT TTTAGCCATT 3240 

TCATTATGTT GGGGATGAGT TTCCTTGTTT TATTCCCTTT GGAATTTGCT CCAATTCATA 3300 

AATTTGCAGT TTTATGTCTT TTACCAAACT TAGAGGTTTT CAGCCTAATT TCTAAAAATA 3360

CTTTTTATTA GCCTGATTTT CATCTTTATA GGAAATAGTT TAAGTGATGA CAAGTTCCAA 3420

TAGCTTATAT GCCCAGAAGG CCTTCAAAAT AAGAATTTTG AAAGAATACA GAAAACAAAC 3480

TTTTATATCC TTCTCATGTC TTCTACTGTA AAATTCATAT GCTTTGCTAC TCTAAACCTA 3540

GTTTGAAATC AACAGTCTTG AGAATAGATG AAAATTTTGA TGAATAGTGG AATTCTTTTA 3600

AATGGAAACC TCTTACATGT GATTTTCCTT GCCATCTAGA AATAAACCAT AGTATTTATG 3660

TTGAATCAAT CAATATTATA TTTTGTTTTT TTCCTCCTCT TCTGAGACTC TTATTGTGGA 3720

AATGTTAGAC TTTTATGTTT TCCTAAATGT CCCTGATATT CTACTTATTT AGAACATCTT 3780

TTCATTTTTT CCATTATTCT GATTGGGTAA TTTTAATTTG TCTATTTTCA AATTTGCTGG 3840

AGTGTTCACC TGTTGTTGTC TGTGTCGTCC CACTGAGTGC ATTCACCACC TTTTAAATTT 3900

TGGTCACTGT ATGTATCAGT TCTAAAATTT CCATTTTGTT CTCTATATTT TAAATTTCTT 3960

GGCTTATATT CTATTTTCCT GCAAATGTGT CAGCATTTGC TTGTTTGAGC TTTTTTTTTT 4020

TCAAGACAGG GTCTCAACTC TGTTACCCAG GCTGGAGTGC AGTGGTGCGA TCTCAGCTCA 4080

CTGCAACCTC TGCCTCCTGG TTCAAGCGAT TATTGTGCCT CAGCCTCCTG AGTAGCTGGG 4140

ATTACAGGCA TGCACCACCA CAGCCCAGCT AATTTTTTGT ATTTTTAGTA GAGACAGAGT 4200

TTTGCTATGT TGGCCAGGCT GGTTTTGAAC TCCTGGCCTC AAGTGATCCA CCCACCTCAG 4260

CCTCCCAAAG TGCTGGGATT ACAGGCCACT ACACCTGGCA CATTTGAGTA TTTTTTTTTT 4320

TTTTTTTTTT TTGAGATGGA GTCTCGCTCT GTCATCTAGG CTGGAGTGCA GTGGTGTGAT 4380

CTCAGCTCAC TGCAGCCTCT GTCTCCCGGG CTCAAGCGAT TCTCTTGCCT CAGCCTCCTG 4440

AGTAGCTAGG ACTACAGGTG CATGCCAACA CGCCCGGCTA ATTTTTTTAA AAAATATTTT 4500

TAGTAGAGAC AGGGTTTCAC CATTTTGGCC AGGATGGTCT CGATCTCCTG ACCTCATGAT 4560

CCACCCGCCT CGGCCTTCCA AAGTGCTGGG ATTACAGGCA TGAGCCACCG TGCCTGGCCT 4620

CATTTGAGTA TTTTTATAAT GTCTCTTTTA AAGTCTTTGT CAGATAATTC CACTGTACAT 4680

GTTATTCAGT GTTTGGTGTC CACTGAGTTG TCATTTGCCA GACAAGTGGA GATTTTTGCA 4740

GCTCATCCTT GTATTCTCAG TAGTTCCGAT ATGTACCCTC GACATGTGAA TGTTATCTTA 4800

TGAGACTCTG TTTTATTTGT ATCCAACAGA AGATGTTTAT TATTTATTTG GCTTTCTGTG 4860

AACTGAGGTC TTAATATCAG CTCATTTTAA AAGTCTTTGC AGTGGTATTC GGATCTATCC 4920

TGTGTGTGCC TATGAGATTG GGTGCAGTGT ATCCTGTTAG CTCCATTCTC AGGGCGTTTG 4980

AATGTGAATT AGGACCAGCG CAATGAATGC TCAAGTTGGG GTTGGGCGTT AGAATTCATA 5040 
AAAGTCTTTA TATGCTCAG 

Seq ID NO: 83 Protein sequence:



1 11 21 31 41 51 

I I I I I I 

MMRQRQSHYC SVLFLSVNYL GGTFPGDICS EENQIVSSYA SKVCFEIEED YKNRQFLGPE 60

GNVDVELIDK STNRYSVWFP TAGWYLWSAT GLGFLVRDEV TVTIAFGSWS QHLALDLQHH 120

EQWLVGGPLF DVTAEPEEAV AEIHLPHFIS LQGEVDVSWF LVAHFKNEGM VLEHPARVEP 180

FYAVLESPSF SLMGILLRIA SGTRLSIPIT SNTLIYYHPH PEDIKFHDYL VPSDALLTKA 240

IDDEEDRFHG VRLQTSPPME PLNFGSSYIV SNSANLKVMP KELKLSYRSP GEIQHFSKFY 300

AGQMKEPIQL EITEKRHGTL VWDTEVKPVD LQLVAASAPP PFSGAAFVKE NHRQLQARMG 360

DLKGVLDDLQ DNEVLTENEK ELVEQEKTRQ SKNEALLSMV EKKGDLALDV LFRSISERDP 420
YLVSYLRQQN L 

Seq ID NO: 84 Nucleotide sequence: 
Nucleic Acid Accession #: NM_007036

Coding sequence: 56-610 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CTTCCCACCA GCAAAGACCA CGACTGGAGA GCCGAGCCGG AGGCAGCTGG GAAACATGAA 60

GAGCGTCTTG CTGCTGACCA CGCTCCTCGT GCCTGCACAC CTGGTGGCCG CCTGGAGCAA 120

TAATTATGCG GTGGACTGCC CTCAACACTG TGACAGCAGT GAGTGCAAAA GCAGCCCGCG 180

CTGCAAGAGG ACAGTGCTCG ACGACTGTGG CTGCTGCCGA GTGTGCGCTG CAGGGCGGGG 240

AGAAACTTGC TACCGCACAG TCTCAGGCAT GGATGGCATG AAGTGTGGCC CGGGGCTGAG 300

GTGTCAGCCT TCTAATGGGG AGGATCCTTT TGGTGAAGAG TTTGGTATCT GCAAAGACTG 360

TCCCTACGGC ACCTTCGGGA TGGATTGCAG AGAGACCTGC AACTGCCAGT CAGGCATCTG 420

TGACAGGGGG ACGGGAAAAT GCCTGAAATT CCCCTTCTTC CAATATTCAG TAACCAAGTC 480

TTCCAACAGA TTTGTTTCTC TCACGGAGCA TGACATGGCA TCTGGAGATG GCAATATTGT 540

GAGAGAAGAA GTTGTGAAAG AGAATGCTGC CGGGTCTCCC GTAATGAGGA AATGGTTAAA 600

TCCACGCTGA TCCCGGCTGT GATTTCTGAG AGAAGGCTCT ATTTTCGTGA TTGTTCAACA 660

CACAGCCAAC ATTTTAGGAA CTTTCTAGAT ATAGCATAAG TACATGTAAT TTTTGAAGAT 720

CCAAATTGTG ATGCATGGTG GATCCAGAAA ACAAAAAGTA GGATACTTAC AATCCATAAC 780

ATCCATATGA CTGAACACTT GTATGTGTTT GTTAAATATT CGAATGCATG TAGATTTGTT 840

AAATGTGTGT GTATAGTAAC ACTGAAGAAC TAAAAATGCA ATTTAGGTAA TCTTACATGG 900

AGACAGGTCA ACCAAAGAGG GAGCTAGGCA AAGCTGAAGA CCGCAGTGAG TCAAATTAGT 960

TCTTTGACTT TGATGTACAT TAATGTTGGG ATATGGAATG AAGACTTAAG AGCAGGAGAA 1020

GATGGGGAGG GGGTGGGAGT GGGAAATAAA ATATTTAGCC CTTCCTTGGT AGGTAGCTTC 1080

TCTAGAATTT AATTGTGCTT TTTTTTTTTT TTTGGCTTTG GGAAAAGTCA AAATAAAACA 1140
ACCAGAAAAC



TGAAGGACGG 
CCACCTCAGA 
GTAAATATTT 



AACCTATGAC 
TATAGGAGTC 
TAAACATAAG 
GGAGGTTTGT 
TGGAATTAGG 
CTTAGGAAAT 
AGTATTTACC 
GCCTTTGAAT 
TTGTTCAATA 



CCCTGAAGGA 
TGAGAGCAAT 
TTCTGGGGCA 
GATAAATCTA 
ATATATTTTT 
CTCTTGAGGA 
TCTATAAGGT 
ACTCTGGATT 
TGCTGTGACT 
AAAAGAAGAA 
AGTATATTTG 
ATCTCAGAAG 
TGTATTTTAT 
GTAAAGCTGC 



AGTAAGATGT 
TTCAAAAGGC 
TAGGAAACAC 
AGAAGTATTT 
ATAAATAAAT 
AAGAAATCTA 



TTGAAGCTTA TGGAAATTTG AGTAACAAAC 
TGCTGATGTA GTTCCCGGGT TACCTGTATC 
ATACACTTCC ATAAATAGCT TTAACGTATG 
TGGTTTGTGT GTGTATGAAG 



TCGGTGAATT 
TCAATTTTCA 
AAAGAATCTT 
TATTTTATTT 
TCTTGAAGTT 
ATAAGCTGTT 
AGATAC 



GTATTATTTG 
CTGAGGCATG 
TCAAAAAATG 
TTCAATTTAA 
GCAGAAAACA 
AGCACAAACA 
GAAGTGAAGA 
GGCCAACAGA 
AGGTTTTGTT 



TTC-AAAATGG 
ATAAATTTAT 
AGCAACAGAG 
G3TATGAAAA 
TGTCAACTTT 
GGACTGTTGT 
ACTTATTTAA
GTTGTGAATG 
TTAAAAGGAC 



TTAGAATAAA 
TATCCATAAT 
GGACCTTATT 
TAAGTTTTTA 
AAAATATAGG 
ACTAGATGTT 



1440 
1500 
1560



1800 
1860
1920 
1980 



I I I I I 

MKSVLLLTTL LVPAHLVAAW SNNYAVDCPQ HCDSSECKSS PRCKRTVLDD CGCCRVCAAG
RGETCYRTVS GMDGMKCGPG LRCQPSNGED PFGEEFGICK DCPYGTFGMD CRETCNCQSG
ICDRGTGKCL KFPFFQYSVT KSSNRFVSLT EHDMASGDGN IVREEVVKEN AAGSPVMRKW



Seq ID NO: 86 Nucleotide sequence: 
Nucleic Acid Accession #: D86983 

Coding sequence: 52-4491 (underlined sequences correspond to start and stop codons)



I 

AGCCGGCCGT 
CGCTCCAGGG 
ACGCTGGCCG 
CGCACCACCG 
ACCTCCATCC 
CGGCTGAGGA 



11 

I 

GGTGGCTCCG 
GCCCCGGGCG 
TGGTGGCCCA 
TGCGCTGCAT 
TAGATCTTCG 
ACTTGAACAC 



21 

I 

TGCGTCCGAG 
CCGCTGCCTG 
GAAGCCGGGC 
GCATCTGCTG 
CTTTAACAGA 



TCAATTGACA 
AATCAGATAG 
TTTTTGCATA 
ATGAAGAGAT 
GCGGATTTGC 
TATCCCAGAC 
GAAAGGCCCC 
TACTTCACCT 
AATGAGCTGA 
ATCCAGAACA 
GGAGAGGTGA 
TTTGTAATCC 
AGCGCCACAG 
CCAGTTGACC 
CAGGGGGACA 
ACCGCTTTCA 
GTTATTGAGG 
ATCGCCTGGA
TCGGGAACAC 
GCTGTCAACA 
ACCCCAGTGT 
CTCCCGTGCA 
CAGGTGACAG 
GTTGGCCCTG 



GGCAAGCATT 
AAACTTTGGA 
ACAACCGGAT 
TGCGACTGGA 
TGAAAACCTA 
GCATCCAGGG 
GGATCACCTC 
GCAGAGCCGA 
GCATGAAGAC 
CACAGGAGAC 



AAATTTAAAA 
TAAGGGACTT 
CCCAGATTCG 
TACACATTTA 
CTCAAACACA 
CGCGGAGTCG 
ACGCTCAGTG 



AGCCACAGAA 
GCCACCCCCC 
CGCGGGTGAA 



AGGCAACCCC 
AGATTCCCGC 
AGACCAGGGT 
GGTGACCCTC 



31 

I 

CGTCCGTCCG 
TTGGCGCTCG 
GCAGGGTGTC 
CTG3AGGCCG 
ATCAGAGAGA 
AATAATAATC 
TATCTCTATC 
GCCTCTCTAG 
TTCCAGCATC 
GTTCCAGGGA 
CTTCACTGCG 
GGGAACGCGC 
GCAACCATCA 
GACGCAGATG 
AAGCCTGAGA 
CTAAACTTGC 
ATCTACCAGT 
AGGTACTTCG 



TCATCGTCCA 



ACACATTTGT 
CCGAGGGATC 
CAGCTCATTC 
CACTACAACG 
ACCGCCCACC 
CACGACGGCA 
GAGCGCCTGC 
CACCGACTGT 
GGGACGGAGA 



CCAAGGGAGG 
TTAGAATCTC 
TCATCGGCTC 
TTGCCAGCAT 
GCTCCCAGGG 
AAAGTGGAAA 
CAGACGCAGG 
TGGTGCTCAG 
CCATCGTGGA 
TTGACAGCCG 
CTTACACAGT 
AGGAGCATGT 
ACCTGGTGTC 
GGCGCGTGAA 
CCTGTAACAA 
TGAAATCCGT 
ACAACGGGCA 
CCGTCACACC 



CATCACGCCT 
TGCGTGCTCT 
GGCTCTTCCT 
GGATTTCCAG 
GAGCCAGCTC 
TGGTGTTGCC 
CCAGAAGGTC 
TCCCAGCGAC 
CGAGCCCGAG 
ATTTCACATC 
TCGCTATGAG 



TCCTCGTTCT C 



CCTGCAGCAC 
GTACGAGAAT 
CGCCCTTCCC 
CGACGAGCAG 



I 

CGCCGTCGGC 
TGCTGTTCTG 
CGAGCCGCTG 
TGCCCGCCGT 
TCCAACCTGG 
AGATCAAGAG 
TGTACAAGAA 
AGCAACTATA 
TCCCGAAGCT 
CATTTAATCA 
ACTGTGAAAT 
AGGCAGCGGC 
CCCCGGAAGA 
TGACCTCGGG 
TCATCTGGCT 
TGGACGATGG 
GCATGGCAAA 
GGTCTCCAGC 
AGAGCGTCAC 
GAGGTGACCG 
TTTACATACA 
ACATTGACAG 
TGACGCCTCA 
AGGGCAACCC 
TCCGTGGACC GGCGGCACCT 
AGGGCCAGTA 
TGACTGTGCA 
AGGTGGGCGC 
CCTGGAACAA 
GATTCTTGAC 
GGAACACCAT 
GTCGAAATGG 
GAGCTATAAA 
TGCTGGCCTT 
AAATCTTTGA 
ACCTCAACGG 
TCGCAAACCT 
TCCACCAGAA 
GCGCCTCGCT 
CCCCTCGGGG 
TGGTGTCCAC 
TGCTGATGCA 



TCTGGCGGGC 
GCGACCAACA 
CAGTTCACTG 



GTGGCCCACC 
ACAACAGTGG 
CCAGCCATCA 
AGCCCTGAAG 
TGTGTGGCCC 



GGCATTCAGG 
GATACCTAGT 
TGAGATCCAG 
CCTGCACTTT 
CGAGAGGCTA 
CTTGGAATCT 
CCTGTGGTTG 
CATCTGTGAA 
GCTGAACTGT 
GAACACCGTG 
GCGAAACAAT 
GACCCTGATG 
GAACGTGGCC 
TCGACCCACT 
GCTGGAGTGC 
CACACCCTTG 
GAACGTCGTA 
CGTCCATGCC 
GGACAGAGTC 
GCCGCCCGTC 
GGTCCTGTCA 
CGAATGCCAG 
GCCCAGAGTC 



TTGATGGTCG 
CTGAACCTCA 
GACATGTGCT 
CCCATGTGGG 
GGCTTCAACA 
ATGCCGCGCC 
TTCACCCACA 



GGATGGGGTT 
CATCAATGAC 
TGGGTCGGCC 
AGATCCGTTT 
CTCAACCCGA 
GTTCCGGTAT 
ACGGACATTG 
AACAAGTTAC 
GTCGGGCTGT 
GTACCGGACG 
GACCGCCTTC 
CATCAACCCC 
CACCCTGATC 
GTGGGGCCAG 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 

1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
TTCCTGGACC ACGACCTCGA CTCCACGGTG GTGGCCCTGA GCCAGGCACG CTTCTCCGAC 2580 

GGACAGCACT GCAGCAACGT GTGCAGCAAC GACCCCCCCT GCTTCTCTGT CATGATCCCC 2640 

CCCAATGACT CCCGGGCCAG GAGCGGGGCC CGCTGCATGT TCTTCGTGCG CTCCAGCCCT 2700 

GTGTGCGGCA GCGGCATGAC TTCGCTGCTC ATGAACTCCG TGTACCCGCG GGAGCAGATC 2760 

AACCAGCTCA CCTCCTACAT CGACGCATCC AACGTGTACG GGAGCACGGA GCATGAGGCC 2820 

CGCAGCATCC GCGACCTGGC CAGCCACCGC GGCCTGCTGC GGCAGGGCAT CGTGCAGCGG 2880 

TCCGGGAAGC CGCTGCTCCC CTTCGCCACC GGGCCGCCCA CGGAGTGCAT GCGGGACGAG 2940 

AACGAGAGCC CCATCCCCTG CTTCCTGGCC GGGGACCACC GCGCCAACGA GCAGCTGGGC 3000 

CTGACCAGCA TGCACACGCT GTGGTTCCGC GAGCACAACC GCATTGCCAC GGAGCTGCTC 3060 

AAGCTGAACC CGCACTGGGA CGGCGACACC ATCTACTATG AGACCAGGAA GATCGTGGGT 3120 

GCGGAGATCC AGCACATCAC CTACCAGCAC TGGCTCCCGA AGATCCTGGG GGAGGTGGGC 3180 

ATGAGGACGC TGGGAGAGTA CCACGGCTAC GACCCCGGCA TCAATGCTGG CATCTTCAAC 3240 

GCCTTCGCCA CCGCGGCCTT CAGGTTTGGC CACACGCTTG TCAACCCACT GCTTTACCGG 3300 

CTGGACGAGA ACTTCCAGCC CATTGCACAA GATCACCTCC CCCTTCACAA AGCTTTCTTC 3360 

TCTCCCTTCC GGATTGTGAA TGAGGGCGGC ATCGATCCGC TTCTCAGGGG GCTGTTCGGG 3420 

GTGGCGGGGA AAATGCGTGT GCCCTCGCAG CTGCTGAACA CGGAGCTCAC GGAGCGGCTG 3480 

TTCTCCATGG CACACACGGT GGCTCTGGAC CTGGCGGCCA TCAACATCCA GCGGGGCCGG 3540 

GACCACGGGA TCCCACCCTA CCACGACTAC AGGGTCTACT GCAATCTATC GGCGGCACAC 3600 

ACGTTCGAGG ACCTGAAAAA TGAGATTAAA AACCCTGAGA TCCGGGAGAA ACTGAAAAGG 3660 

TTGTATGGCT CGACACTCAA CATCGACCTG TTTCCGGCGC TCGTGGTGGA GGACCTGGTG 3720 

CCTGGCAGCC GGCTGGGCCC CACCCTGATG TGTCTTCTCA GGACACAGTT CAAGCGCCTG 3780 

CGAGATGGGG ACAGGTTGTG GTATGAGAAC CCTGGGGTGT TCTCCCCGGC CCAGCTGACT 3840 

CAGATCAAGC AGACGTCGCT GGCCAGGATC CTATGCGACA ACGCGGACAA CATCACCCGG 3900 

GTGCAGAGCG ACGTGTTCAG GGTGGCGGAG TTCCCTCACG GCTACGGCAG CTGTGACGAG 3960 

ATCCCCAGGG TGGACCTCCG GGTGTGGCAG GACTGCTGTG AAGACTGTAG GACCAGGGGG 4020 

CAGTTCAATG CCTTTTCCTA TCATTTCCGA GGCAGACGGT CTCTTGAGTT CAGCTACCAG 4080 

GAGGACAAGC CGACCAAGAA AACAAGACCA CGGAAAATAC CCAGTGTTGG GAGACAGGGG 4140 

GAACATCTCA GCAACAGCAC CTCAGCCTTC AGCACACGCT CAGATGCATC TGGGACAAAT 4200 

GACTTCAGAG AGTTTGTTCT GGAAATGCAG AAGACCATCA CAGACCTCAG AACACAGATA 4260 

AAGAAACTTG AATCACGGCT CAGTACCACA GAGTGCGTGG ATGCCGGGGG CGAATCTCAC 4320 

GCCAACAACA CCAAGTGGAA AAAAGATGCA TGCACCATTT GTGAATGCAA AGACGGGCAG 4380 

GTCACCTGCT TCGTGGAAGG TTGCCCCCCT GCCACCTGTG CTGTCCCCGT GAACATCCCA 4440 

GGGGCCTGCT GTCCAGTCTG CTTACAGAAG AGGGCGGAGG AAAAGCCCTA GGCTCCTGGG 4500 

AGGCTCCTCA GAGTTTGTCT GCTGTGCCAT CGTGAGATCG GGTGGCCGAT GGCAGGGAGC 4560 

TGCGGACTGC AGACCAGGAA ACACCCAGAA CTCGTGACAT TTCATGACAA CGTCCAGCTG 4620 

GTGCTGTTAC AGAAGGCAGT GCAGGAGGCT TCCAACCAGA GCATCTGCGG AGAAGGAGGC 4680 

ACAGCAGGTG CCTGAAGGGA AGCAGGCAGG AGTCCTAGCT TCACGTTAGA CTTCTCAGGT 4740 

TTTTATTTAA TTCTTTTAAA ATGAAAAATT GGTGCTACTA TTAAATTGCA CAGTTGAATC 4800 

ATTTAGGCGC CTAAATTGGT TTTGCCTCCC AACACCATTT CTTTTTAAAT AAAGCAGGAT 4860 

ACCTCTATAT GTCAGCCTTG CCTTGTTCAG ATGCCAGGAG CCGGCAGACC TGTCACCCGC 4920 

AGGTGGGGTG AGTCTCGGAG CTGCCAGAGG GGCTCACCGA AATCGGGGTT CCATCACAAG 4980 

CTATGTTTAA AAAGAAAATT GGTGTTTGGC AAACGGAACA GAACCTTTGA TGAGAGCGTT 5040 

CACAGGGACA CTGTCTGGGG GTGCAGTGCA AGCCCCCGGC CTCTTCCCTG GGAACCTCTG 5100 

AACTCCTCCT TCCTCTGGGC TCTCTGTAAC ATTTCACCAC ACGTCAGCAT CTAATCCCAA 5160 

GACAAACATT CCCGCTGCTC GAAGCAGCTG TATAGCCTGT GACTCTCCGT GTGTCAGCTC 5220 

CTTCCACACC TGATTAGAAC ATTCATAAGC CACATTTAGA AACAGATTTG CTTTCAGCTG 5280 

TCACTTGCAC ACATACTGCC TAGTTGTGAA CCAAATGTGA AAAAACCTCC TTCATCCCAT 5340 

TGTGTATCTG ATACCTGCCG AGGGCCAAGG GTGTGTGTTG ACAACGCCGC TCCCAGCCGG 5400 

CCCTGGTTGC GTCCACGTCC TGAACAAGAG CCGCTTCCGG ATGGCTCTTC CCAAGGGAGG 5460 
AGGAGCTCAA GTGTCGGGAA CTGTCTAACT TCAGGTTGTG TGAGTGCGTT 



BAA13219 

1 11 21 31 41 51 

I I I I I I 

SRPWWLRASE RPSAPSAMAK RSRGPGRRCL LALVLFCAWG TLAWAQKPG AGCPSRCLCF 60 

RTTVRCMHLL LEAVPAVAPQ TSILDLRFNR IREIQPGAFR RLRNLNTLLL NNNQIKRIPS 120 

GAFEDLENLK YLYLYKNEIQ SIDRQAFKGL ASLEQLYLHF NQIETLDPDS FQHLPKLERL 180 

FLHNNRITHL VPGTFNHLES MKRLRLDSKT LHCDCSILWL ADLLKTYAES GNAQAAAICE 240 

YPRRIQGRSV ATITPEELNC ERPRITSEPQ DADVTSGNTV YFTCRAEGNP KPEIIWLRNN 300 

NELSMKTDSR LNLLDDGTLM IQNTQETDQG IYQCMAKNVA GEVKTQEVTL RYFGSPARPT 360 

FVIQPQNTEV LVGESVTLEC SATGHPPPRI SWTRGDRTPL PVDPRVNITP SGGLYIQNW 420 

QGDSGEYACS ATNNIDSVHA TAFIIVQALP QFTVTPQDRV VIEGQTVDFQ CEAKGNPPPV 480 

IAWTKGGSQL SVDRRHLVbS SGTLRISGVA DHDQGQYECQ AVNIIGSQKV VAHLTVQPRV 540 

TPVFASIPSD TTVEVGANVQ LPCSSQGEPE PAITWNKDGV QVTESGKFHI SPEGFLTIND 600 

VGPADAGRYE CVARNTIGSA SVSMVLSVNV PDVSRNC-DPF VATSIVEAIA T\T3RAINSTR 660 

THbFDSRPRS PNDLLALFRY PRDPYTVEQA RAGEIFERTL QLIQEHVQHG LMVDLNGTSY 720 

HYNDLVSPQY LNLIANLSGC TAHRRVNNCS DMCFHQKYRT HDGTCNNLQH PW1GASLTAF 780 

ERLLKSVYEN GFNTPRGINP HRLYNGHALP MPRLVSTTLI GTETVTPDEQ FTHMLMQWGQ 840 

FLDHDLDSTV VALSQARFSD GQHCSNVCSN DPPCFSVMIP PNDSRARSGA RCMFFVRSSP 900 

VCGSGMTSLL MNSVYPREQI NQLTSYIDAS NVYGSTEHEA RSIRDLASHR GLLRQGIVQR 960 

SGKPLLPFAT GPPTECMRDE NESPIPCFLA GDHRANEQLG LTSMHTLWFR EHNRIATELL 1020 

KLNPHWDGDT IYYETRKIVG AEIQHITYQH WLPKILGEVG MRTLGEYHGY DPGINAGIFN 1080 

AFATAAFRFG HTLVNPLLYR LDENFQPIAQ DHLPLHKAFF SPFRIVNEGG IDPLLRGLFG 1140 

VAGKMRVPSQ LLNTELTERL FSMAHTVALD LAAINIQRGR DHGIPPYHDY RVYCNLSAAH 1200 

TFEDLKNEIK NPEIREKLKR LYGSTLNIDL FPALVVEDLV PGSRLGPTLM CLLSTQFKRL 1260 
RDGDRLWYEN PGVFSPAQLT QIKQTSLARI LCDNADNITR VQSDVFRVAE FPHGYGSCDE 1320 

IPRVDLRVWQ DCCEDCRTRG QFNAFSYHFR GRRSLEFSYQ EDKPTKKTRP RKIPSVGRQG 1380 

EHLSNSTSAF STRSDASGTN DFREFVLEMQ KTITDLRTQI KKLESRLSTT ECVDAGGESH 1440 

ANNTKWKKDA CTICECKDGQ VTCFVEACPP ATCAVPVNIP GACCPVCLQK RAEEKP 



3 start and stop codons) 



I 

AATTCGAGGA 
TTGGTGGCAA 
CCTCTCCTCC 
CACCTATGGA 
AGTTATGGAT 
GAAATACTCT 
TCCAGGACAT 
AGACCTTGTG 
CAGAGAAATC 
CAAGGGCCAG 
GAGTGCTCAG 



11 



21 



TGAAAGGCAA 
CGAGAAAGAT 
TGAACAGGAA 
AGATTTCCTG 
GTTACTACAG 
GAGACAGAAG 



AAAGAGGCGT 
AGAAGAAGAA 
GGAGCAGCGG 
GCATGACCAT 



CCCACTGCAG 
TGAGCCCAGG 
CTCCTCAGGG 
CTCCGGGGAA 
CCTGGAAAAT 
TGCTGGCGAA 
GCCACCTCAC 
GGAGGACGAC 



TGTCCATGAT 
CGTCCGCCAG 
TACACCTTTT 
ATCTGTGGTG 
CCGGAAAGGC 



AGTGAATTTG 
GAAGGTCTAT 
GAATGTCTTG 
GTTAAGAAAT 
CGTAGGGGAT 



CAAATTTATG 
CACTGTTGAG 
TGTTGATGTG 
CCCACACTCT 
AGATGGAATG 
AAGGATCACC 
TCGATCCAAT 
TGGTCACTTG 
ACGCAATGAC 
CATGACCTTA 
TGGCCTCCAG 
CTGCACCGAG 
CTCCTTCCTG 



I I 
TCCGGGTACC ATGGCACAGA 
AAAGGGAAAA TGGCGAACGA 
CTGCGGGATC CTGCTGGGAT 
CAAGTCTATA AGGGTCGACA 
GTCACTGAGG ATGAAGAGGA 
CATCACAGAA ACATTGCAAC 
GATGACCAAC TCTGGCTTGT 
AAGAACACCA AAGGGAACAC 
CTGAGGGGAC TGGCACATCT 
AATGTGTTGC TGACTGAGAA 
CTGGACAGGA CTGTGGGGCG 
GAGGTCATCG CCTGTGATGA 
TCTTGTGGCA TTACAGCCAT 
CCAATGAGAG CACTGTTTCT 
TGGTCGAAGA AGTTTTTTAG 
CCCTCTACAG AGCAGCTTTT 
GTTAGAATCC AGCTTAAGGA 
GAAACTGAGT ATGAGTACAG 
GGAGAGCCAA GTTCCATTGT 
AGACTGCAGC AGGAGAACAA 
GAGCAACAGC TCCGGGAGCA 
CGGATTGAGC AGCAGAAAGA 
GAGGCTAGAA GGCAGCAGGA 
CTAGAGGAGT TGGAGAGAAG 
AAGAGGAGAG TTGAAAGAGA 
CACTTGGAAG TCCTTCAGCA 
AGGAGGCCGC ACCCGCAGCA 
CCAAGCTTCC ATGCTCCCGA 
GTTCCTGTGA GAACAACATC 
GGCAGTGGGC AGCAGAATAG 
CTTCTGTGGG AGAGAGTGGA 
TCCAGCAACT CAGGATCCCA 
CGCTTCAGAG TGAGATCATC 
GCAGTGAAAA AACCTGAAGA 
GTGGATCTGA CCGCACTGGC 
AAAGTAACGG ACTACTCCTC 
GATGTGGAGC AGGAAGGGGC 
TCATCTCTGA ATTTGAGCAA 
GATGTAGAAA GTGAGCCGGC 
ACTCAGTCCG CTAGTAGCAC 
ATAGACCCCA GATTACTACA 
GGATTTTCCT GTGATGGGAT 
TCAGTGGTCA ATGTGAATCC 
AAATACAAGA AGAGGTTTAA 
CTAGTGGGTA CAGAGAGTGG 
CCTCTTATCA ACCGAAGACG 
GTGACAATAT CTGGCAAAAA 
AAAATACTTC ACAATGATCC 
TTGGAAGGAT GTGTACATTA 
GCTTTGAAGA GTTCTGTGGA 
GCCTTTAAGT CATTTGGAGA 
GAAGGCCAGA GGTTGAAAGT 
GATTCAGGAT CAGTCTATGA 
ATGATCCAGT GTAGCATCAA 
GAGCTTCTGG TGTGCTATGA 
AAGGATGTAG TTCTACAGTG 
CAGACAATGG GCTGGGGAGA 
GATGGTGTGT TCATGCACAA 
AAGGTGTTCT TTGCCTCTGT 
GGCAGGACTT CTCTTCTGAG 
AGTCTTCAAG ATCCTGAGAA 
GGCAACCAGG ACAGCTGTGT 
TTCCTCTTAT ATACCAGTTT 



31 

I 

GCGACAGAGA 
CTCCCCTGCA 
TTTTGAGCTG 
TGTTAAAACG 
AGAAATCAAA 
ATATTATGGT 
TATGGAGTTC 
ACTCAAAGAA 
TCACATTCAT 
TGCAGAGGTG 
GAGAAATACG 
GAACCCAGAT 
TGAGATGGCA 
CATTCCCAGA 
TTTTATAGAA 
GAAACATCCT 
TCATATAGAT 
TGGGAGTGAG 
GAACGTGCCT 
GGAACGTTCC 
GGAAGAATAT 



CATTTATTGT 
AAAAGTCTGC 
GTGGAAGTGG 
GGTCAGTTGG 



TATTTGTTTT 



GCTTTCATCA 
TGTGGGGCTG 
GACTGGATCG 
CATGTGATTC 
AAACTTGTTG 
TTCATAGGCA 
GCCACCTATG 
GAAGGTGCTC 
AACCCTCCTC 
GGGTGCCTGG 
TTTATAAGGG 
CGTACCAGGA 



TTGGAAATGG 
CAGCCATCAA 
ATATGCTAAA 
AAAAGAGCCC 
GGTCCATTAC 
CTTACATCTC 
ACCGGGATAT 
ACTTTGGTGT 
CTCCCTACTG 
ATTACAGAAG 
CCCCTCTCTG 
CCCGGCTGAA 



ACGTGAACAG 
GCGCAAAGAA 
ACAGGAGTAT 
GCAGCTGCTC 
CTCGCAGCAG 
GCCCAAAGCC 
TCGCTCCCCT 
CCAGGCAGGA 



GGTGAGTCTA 
GAGGCTCTTC 
AAAAGGCAAC 
CGGCTAGAAG 
CGAAGGAGAG 
GAAGAGGAGA 
ATCAGGCGAC 
CAGGAGCAGG 
CCGCCACCAC 



ATCAGCCAAA 
AGAAGAGAGG 
AGGAAGTGCC 
CTCTTCGCCG 



TGCTGGCAGA 
AGCAACAAAG 
AACAAGAAGA 
GGAGACGGGC 



GTTCTGTCCC 



GCCCGGGTCT 
ATCCAAGTCT 
TAAAAAGGAA 
CAAAGAGCTT 
ATCCAGTGAG 
TGACGAGTCC 
TGGTGAAACG 



CCCAGACCTG 
CACCCTGGGT 
GAAGGCTCTC 
GTTTTCAGAC 



ACTCCAGAAA 
GATTTCTCCA 
GAGACCAGAA 
TACCAACACT 
CTCTGAGATT 
CCTGATGCTG 
ATTTCAACAA 
GGATAAGTTA 
AGAAGTTGAG 
TAAAGTTGTA 
AGTCTATGCG 
ATTGGTACAT 
GATCTATGGA 



CACAAATCTT 
TCTAGCGGAA 
GCCATAAGGC 
AGGCCACAGA 
CTGTGTGCTG 
CTGGACAGAA 
ATG3ACGTAC 
CGTGTCTACT 



AAATATGAAA 
TGGGCACCAA 
AAGCCATTAC 

TCCTGTGCTG 



CCATGTTACT 
CGCAGCAGGA 
CTGCTGACCG 
GTCGAGATTC 
CCACCAGTAT 
GCAGTGGCAG 
CTCAGAGTGG 
CATCTCAGCG 
CCCTCAAGCC 
AAGATGTACG 
CGACGGATGA 
CAGAGGACAC 
AAACCATGAT 
GCACTCTAAT 
CCTCCTCCTT 
CAACAGTGAC 
AAGATCCTAC 
GTGACACCCC 
CCTTATGGGG 
GTGGCCAAGG 
TTGAGGGCTT 
ATTTGTCCTG 
GATGGACAAC 
GAATCAAATT 
AGCCATATCA 
TGGTGGATCT 



ACCCCATGCA 
AGATGAGGGG 
GGGAGAGATG 
GAAGGCCATA 
AAGGGCTCAA 
TCGGTCTGGT 
CTGGTAGAAG 
CTTGGAATXC 
GTGCAGACCT 
ATCCCCATTC 



GAGATCCGAT 
AGACTAAAAT 
GGCAGCAGTC 
CAGTGTGATC 
CTTGTAACTG 
CATGTGTTGG 



TAAGAAAGAA 
TCCCCAATAC 
ACACATATGG 
TAGCATATAT 
CTGTGGAAAC 
TCTTGTGTGA 
AGGTTTATTT 



GAGCTCGGAG 
GTTCTCTCCC 
TCTTACTCCA 



3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
Protein Accession #: NP_004825.1 



DEEEEIKLEI 
KGNTLKEDWI 
TVGRRNTFIG 
ALFLIPRNPP 



AYISREILRG 
TPYWMAPEVI 
PRLKSKKWSK 




NLSNGETESV 
RLLQISPSSG 
KRFNSEILCA 
SGKKDKLRVY 
SSVEVYAWAP 
SVYDIYLPTH 
VLQWGEMPTS 
FASVRSGGSS 



EDVRPPHKVT 
KTMIVHDDVE 
TTVTSWGFS 
ALWGVNLLVG 
YLSWLRNKIL 
KPYHKFMAFK 
VRKNPHSMIQ 
VAYIRSNQTM 
QVYFMTLGRT 



PAGIFELVEV VGNGTYGQVY KGRHVKTGQL 
NIATYYGAFI 
LAHLHIHHVI 
ACDENPDATY 

KFFSFIEGCL VKNYMQRPST 
EEVPEQEGEP 
LLAERQKRIE 
RRRAEEEKRR 
PQQERSKPSF 
STSIEPRLLW 
PSQRLENAVK 
TTDEEDDDVE 
GTLIVRQTQS 
QDPTRKGSW 
SGQGKVYPLI 
GWTTVGDLEG 
LVDLTVEEGQ 
LPNTDGMELL 



V3REQEYIRR 
HAPEPKAHYE 
ERVEKLVPRP 
KPEDKKEVFR 



AAIKVMDVTE 
GSITDLVKHT 
DFGVSAQLDR 
PPLCDMHPMR 
DQPNERQVRI 
TLRRDFLRLQ 
EQQRREREAR 
QLEEEQRHLE 
PADRAREVPV 
GSGSSSGSSN 
PLKPAGEVDL 



SEPAMTPSKE 
CDGMRPEAIR 
TESGLMLLDR 
HNDPEVEKKQ 
SFGELVHKPL 
CSIKPHAIII 



ASSTLQKHKS 
NVNPTNTRPQ 
NRRRFQQMDV 
CVHYKWKYE 
RLKVIYGSCA 



SSSFTPFIDP 
SDTPEIRKYK 
LEGLNVLVTI 
RIKFLVIALK 
GFHAVDVDSG 
NTYGRITKDV 



1020 
1080 
1140 



Seq ID NO: 90 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 2-71 (underlined sequences correspond to start and stop codons) 



TTACACTTCA 
TCTCTCCTTA 
AAAACAAAAC 
TTGTTATGTT 
AAAAATCACC 
ATTAACATTT 
AATTATCCAC 
AAGCATTTGA 
CAGCACCAGC 
ATTAAGCAAG 
TCAGTTCTGA 
CACAATGGTC 
TCTGTGAATG 
CACCCTTATG 
ACATCTGACT 
TCCAGTGACC 
ATACATTTAT 
GTGATACACT 
CTTCATGTTG 



ATTCCTTACA 
AGAAAATGTT 
AACCCAACAA 



ATTTATCTTT 
TTGAACATGA 
TTAATTATTA 
AAGTTGAAGT 
AAGCCATAAT 
ATATAAGACT 
GGATGTTAAT 
TCACGTTCTC 
TCTTTTTAAT 
TTTATAATAA 
TACGGGAGCT 
CAATTCTAAA 
TAGCCTTAGC 
CCGCAGATCT 
CATAAATGTT 



CGGTATTTCA 
TATAAAGCTG 
CCTAGATAAC 
GGCCTACTCA 
CAGCTATTAA 
AAGGCAACTT 
CTTAATCTTA 
GGAATTTAGG 
CTTTATACCT 
ACCTGCCCAA 
CGTCAAATTT 
ACATTTGTTC 
TATGGGCATA 
TTTCTTGAGA 
GCAGGGAAGT 
GACCATAGCA 
AGGCAATAAA 
GAAATAGATG 
ATGAATATTA 



AACAAACAGT 
AAAGGAAATC 
TACAGTGATC 
AACAGCTAAA 
TCTTTTGAAT 
CTGCACAATC 
AAAAAAATTA 
AAAGCCATAA 
ATCAGTTCTA 
GAATTCAGTC 
TCTTTGGACT 
TCGCGAATAA 
ATTGTGCTTG 
ACAGCAAACT 



CCTGCAAGTG 
CCAAGAATCA 
TGTTCTCAGA 
ATAAAAAGTT 



TTTGCTGAGA 
AAACAGTAAT 
AGGGAGCACA 
TAACAACACC 
GAATAAACTG 
CTGTATCCAA 
GAACCCAGAA 
AAATATAAAT 
TTTCTATTAA 
TTTTTTCATT 
GCATTCCTCA 
ATTGATAAAA 
ACTGGATAAA 
GCATTTACCA 
TTCGAACGGC 
ACACAACAAG 
CTTTGAAGAC 
CAACAAAGTC 
GATTGAGA 



GGAGCTTTTG 
CTTAAAAATG 
GTTCAACTCC 
AGTGGCAGAT 
TGACAAACAA 
GCAAACTTTA 
CTTTTCAATG 
ACTGTTATCA 
CAGTAAAAAC 
TTTGTTTTTC 
CTACTTTTTG 
GGTGTTAAGT 
AACTTAAGTC 
TCGTAAAACA 
TCCTCAGAAA 
CAGATTTATT 
ACAGCAAAAA 
CCTTCAGAAT 



Seq ID NO: 92 DNA sequence 

Nucleic Acid Accession #: NM_003706.1 

Coding sequence: 310-1935 (underlined sequences correspond to start and stop codons) 
GGCACCGAGG 
GACTACACCG 
CACCGCGCAT 
CAGTGCACCA 
GCGGCCGTGG 
GCTGATGAGG 
GCCTGCCTTG 
CTCGCAGGGG 
ATGGAAGCTC 
AAGAGCCTAC 
TGGGCCTACA 
ATGAAGAAGC 
GACCTGCAAC 
CACCACGCTG 
TTCAAGAAGG 
TTATGGGGAA 
AGGAATCTGA 
CACCTTATTT 
CCAGAAGATG 
ACCAGGACCT 



GACCTCCTCC 
ATCCCCGCAG 
CTAAGCGCAG 



GGGTCCTGAG 
TCTCTGGATC 
TCGAGGCTGA 
AGAAAACCAT 
TGGTTATCTC 
CCGTGGAAGA 
CTTCCTGGCA 



CACAGGGGGA 
TCTTCACCCG 
GGAGCCCAAG 
CCCAGACCCA 
TGAAGTTTCC 
ACTTCATGTG 
TGCTGTGCTG 
TGAGATGAAA 



CCTGAAACAT 
CCAAGCAGCG 
TAAGCAAACC 



AAAGGAAAGC 
CGGAGATTCC 
CCTCCGCACC 
GGCTCAGGAG 
ATAATTCCTG 
CTGAAAGCTC 
GGCTCAGGCG 
GAACAGGGCC 
ATATCTTCTC 
CGATTTACCC 
AGGTCTGAGA 
AGAGAACTGC 



AGGAGGACCA 
GGCTGAAGGA 
GGATTCCGGA 
GACTGAGAAT 
GGCTCCAGAA 
TGAAGAAGCT 
GAGGACTGCG 



TCTACACCAA 
GACAGGAGTG 
ATTACTCTCT 
CGGAGTCTCA 
TATTTGCAGC 



GAAGACTGGT 
GTGCTCTTGG 
CCCTGAAAGG 
TTGCCCGATT 
AAGGCGGTGA 
CCCTGGAAAA 



ACTGGGGGCC 
CAGAACTCAC 
TAACACTGAA 
TTTATGGAGA 



GGGACCACTC 
CGGAAGCACC 
CTGCCCCCGA 
TTCGAGACCA 
GTAGAAGAGG 
GGAGAAACTG 
GATATTGAGG 
GATGTGGTGG 
ATCCTTAGAG 
TGCTGCTTGG 
GGCCCTGAAG 
CTTGAGTTCA 
TTTGATAGCG 
CAGGTCCCTC 



ACAACTTCCT 
TCCACCTGGT 
CGCGGGAGGT 
TCCGGGCTAC 
CTGAGCTGGA 
GACCAGTGGT 
CATGGAGTGA 



AGTTGATGAA 



TCCACCTGGC 
CCTTGGCTGT 
TCATTCAGAA 
GGGAGATCCT 



GCCTGAACAC 
GCAGGAGCAG 
TGTGAAGAAA 
GTACAAACAC 
GGATGCTGGT 
TCACCTCATC 
CACTGACTAC 

GATACATTTT 
CACATACGAC 
GGCATTAGCC 
CGTGGCCGGG 
CCTCAGCTTC 
CTTCCTGTTC 



TTTATGACTG CACTTCTAGC 
TGGGCTTAAA TCATGGGCTA 
GCTCAATAAA TGCTTGCTGA 
AAAAAAAAAA AAAAAAAAAA 



GAAGGTGTCC 
GGAGCTGGAG 
AATGAATGTC 
CAGTAGCTCT 
TCTCTCCACA 
TTGACTGATG 
AAAAAAAAAA 



GCACCAGAGA 
TTTGTTTCCA 
CCTGAGAGAG 
GTCATTAGGG 
AGGGCTGTTG 
CAAGAAAGTT 
ACCTGGCTGA 
CCCCATGAGG 
ACAGGCATTT 
GGTGGCATCC 
TTAGCCATCA 
CTCTCCTTCG 
TGCCGCCGCC 
AAGGCCCCCG 
CCCCTGTTCA 
ACATTCAAGC 
AAGAAGAATG 
CTCTACTACC 
CAGGGCACTG 
TTCACTCCCT 
CCAATCACCA 
AAGGAGCTGA 
CATGAGTGTC 
ATAGTTCAGA 



TAACCCACTT 



AATACATTTT 
CTAATGCTAA 
CACAAGGGGA 
CTGAGATGCT 
ACCCCGAAAG 
GCGCTTCAAA 
GGGACAAGAT 
ACACTCCCTT 
ACTTCAGTGC 
ACAAGATCCC 



GCCAAGTGGA 
AAAAAAAAAA 
AAAA 



ACATAGATGC 
TTGCTGACAC 
TCAGGGAAAA 
CGAAGGATAG 
TGCGCCTGTT 
TCAGCCACAC 
GTGACCAGCT 
AGGTGGTGAA 
TGACAATCAG 
CCAATGCCTT 
GCTCTGTAGA 
GCTCTGAGAA 
AAAAAAAAAA 



AGAAGAAAAG 
AAGGATTGAG 
GGCTCACATT 
CGTCACGTAC 
TGATGGTGAC 
GGACTTGGCT 
GACCGACTTC 
TTTGTCCAAT 
CATTGACAAT 
GTTCACCCCT 
CGGAAGCAAA 
CCTGAGAGGT 
TGACCAGTTA 
AAGCATTGGA 
ACATCCTCCC 
CGAGAATTGG 
GAAAGGCTCA 
GTGGGAATGG 
AATGAGCAGC 
CCCACTCGTG 
CGGAGATCCT 
CTTTCCCCAA 
CATCCTGAAA 
CTGTGGAGGT 
CTACACTCTA 
CAAGAAGAAG 
TGCCCGAAGT 



GCTTCATGGC 
AGACTGTGAT 
ATTTGTCCTG 

AAGCATCATG 
CCACTGCTCC 
AGTAAGAACT 
TACAACAAGT 
AAAAAAAAAA 



1020 
1080 
1140 
1200 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 



11 



21 



31 



41 



51 



III] 
MGSSEVSIIP GLQKEEKAAV ERRRLHVLKA LKKLRIEADE APWAVLGSG GGLRAHIACL 
GVLSEMKEQG LLDAVTYLAG VSGSTWAISS LYTNDGDMEA LEADLKHRFT RQEWDLAKSL 
QKTIQAARSE NYSLTDFWAY MVISKQTREL PESHLSNMKK PVEEGTLPYP IFAAIDNDLQ 
PSWQEARAPE TWFEFTPHHA GFSALGAFVS ITHFGSKFKK GRLVRTHPER DLTFLRGLWG 
SALGNTEVIR EYIFDQLRNL TLKGLWRRAV ANAKSIGHLI FARLLRLQES SQGEHPPPED 
EGGEPEHTWIj TEMLENWTRT SLEKQEQPHE DPERKGSLSN LMDFVKKTGI CASKWEWGTT 
HNFLYKHGGI RDKIMSSRKH LHIiVDAGIiAI NTPFPLVLPP TREVHLILSF DFSAGDPFET 



Seq ID NO: 94 DNA sequence 



I 

AGGGAAAAAA 
TGAGAAAGAT 
AGTAAAACCC 
AAATGAGGGT 
CTCAAGTGTG 
GGACATGTCT 
TGGCTTTCTG 
CCCAGCAGCT 
CAGGGGCAAC 



11 
I 

ACTCCATTAA 
TTAGCAAAAT 
TCAGGCTGCT 
TTCTTAACTC 
TGTTACCTTT 
GTATGATCCA 
CCGCTTCATG 



GACGAGAAGA 



AAAGCCCAGC 
TCCACCGTAT 
GAAATTTCTA 
ACACTGAGAG 
CTTTACCAGC 
AGGTACCCAA 
TGCTTTGGAA 
GAATGAACTG 
AGATGTTGAA 



31 
I 

TTTCCTCCAT 
CTTTTGCCAG 
GGCTGTTAGG 
CGGAAAGGGG 
ATGGTAAGCA 
AGTCAGACAG 
AAAGCAGGAG 
CAAAGAGGGA 
GTGTGTGGTG 



AAGCCCCTCG 
CAGACCCTTT 
ACAGGACATA 
AGTAAACTCA 
AAGCAATAGC 
ACTGACAGCA 
GTGGGGGACG 



51 
I 

ACTTGGAAAA 60 

GGGAGAGCAG 120 

AATTCTGTGA 180 

TCATAACTCC 240 

TCCCAGCCTC 300 

AGCCTGGCAC 360 

AGCAGGAGTC 420 

GCTGCGGCTG 480 

GTGCCGTGGG 540 
GAAAACCTGC 
TGTGTTTGAC 
GTATGACACC 
GGATGTGTTT 
GGAATGGGTC 
CCAGATTGAT 
ACCTCTCACT 
GGAATGTTCA 
CATTTTCCAC 
TATCTGAGGT 
TCTGTGGCCA 
GAGGCTTGCC 
ACGACCTCCC 
GATTGCAGAC 
TTACAAATCC 
AAGAAACTTG 
CAATACTAAC 
TTATTTCAAG 
AGCCTAAAGA 
TTAGAACCAA 
GTTTTGGAAA 
GTTGCAGGTC 
TTTTTCCATG 
TGTTTTTCAT 
TGTATTCAGT 
CCTTGATTTC 
ATAAAAGAAA 
TGCCTTAATT 
CTCATTCCAT 
TTCCTTACTC 
TATCCATTTG 
AAAGCAAAAG 
CAGGAATTTT 
CTTCTAATTA 
TTTTTTTCAG 
ACTTTCCAGA 
CTGGAATAAG 
CAACACTTAG 
GGGTAATATA 
ACACATACCT 
AAGGTAAGGC 
AGAACCTGTC 
CCATCTAATT 
AGTTTTCAGT 
AATCAACTGT 



CTGCTGATGA 
CACTATGCAG 
GCGGGACAGG 
TTGATCTGCT 
CCCGAGCTCA 
CTCCGTGATG 
TACGAGCATG 



GCTACGCCAA 
TTACTGTGAC 
AGGACTACAA 



AGGACTGCAT 
ACCCAAAAAC 
GTGTGAAGCT 



CGACGCCTTC 
TGTGGGAGGC 
CCAGCTGAGG 
AAACCCTGCC 
GCCTCACGTG 
CTTGGCCCGT 



CCCAAGAAAA 
TGTCTGGGAC 
AGCTCCAGCC 
CCATCACCCT 
TGCCAGCCAG 
AGTGCCGCTG 
CCAGCTCATG 
ATTCCTCTAT 
CTTTTTTTCT 
AAATGTACTA 



AGAAGAAACG 
CTGCCTCCAC 
AAAAAGGAGG 
CTGAGCCCTC 
AAGCATCCGT 
CTGATCGCAT 
AACGTGAAGC 
TGCTGGCCTT 
GAATCTGCTG 
ATTTCCAGTT 
TATATTCATT 
TGAATGCCCG 
CATGATTTTT 
CCAAAGATAC 



CAAAGCGGTT 
CTGTTCTGAG 
CCCATCCAGG 
GCACGACCAG 
CCAACACAGC 
ACTGCACGCT 
CAAAAACAAA 
TGATAGGAAA 
ACTTGATGTC 
TTCTACCCAT 
CACTCAGGCC 
TCTATTTTCA 
ATTATAAGAA 
GAACCTAATT 



ATAGTTATAT 
CAGTATGTTT 
TAATGGGCCA 
ACTGGATAGA 
CTAATTACTG 



TATTGCCACG 
CCAAGTTGAT 
AAAGTGGACA 
ACTTTCTCTC 
GATCTGAGCA 
TTTCTATTTG 
GTAAATGACA 
ATCCTCCCAA 
TCCCCTGCCC 
CTCCCAGTAA 
TGTATCATAG 
GTCTTAGTTG 
TTGCATGAGC 
TTCAAQCTCC 
CACTTAAGAA 
GGAGGCTGAG 
GTGAGATCCT 
GTAGTCCCAG 
TGCAGTGAGC 
AAAAAAAAAA 
GCTAAAGATT 
CAAAGCTTGA 
GTATGAACTA 
TAGTCCTGTA 
CTTGCTAACG 



TGGGGGAAGA 
TTCCTGCTCC 
ATGCCCGGAT 
TAAAGATACG 
CTTCAACTGA 
TTTTCCAGTT 



AATGAACTGT 
ATATGGCAGG 
TCCCTTCCCC 
CCTTTTATTC 



TCCACAATGT 
GGAATCCTGT 
AGCGAATTAC 
CTTATAAGTG 
AAAGTGCTTC 
CACTGTTGGA 
TTGCGTGATA 
GTGGGTGGGC 
GTGTCTCTAT 



GTGACATAGA 
GCCCAATGAT 
TTCCTATCTT 
CCCTGGAATC 



TGTGACTGTG 
AAAAAACAAC 
TTCTTTCATA 
CATTTAGAGA 



CAGGAAGTAA 
GACTGTTGTT 
AATAAAACAA 



ATAAATGAAA 
GGCCACTCAT 
GCACTCAACA 
GTTGATCTTG 
AGTGTCTGGG 
TCTAGAACAA 



AAAAGCCAGC 
GCCAGGCACC 
CGCTTGAGCT 
AAAAAAATTA 
GGCTGAGGTG 
CCACTACACT 
CTACATTTCA 
CGCACACACT 
AAACAAGGAC 
GAGGTTATGA 
TGTTATTAGG 
TTCTCACTGA 
GGCAAATAGC 



CCAGAGGAAT 
AAGCAACACT 
CCACTCTCCT 
TCTTACCACA 
CCTTATGTCC 
TTGCTGTATA 
ATCGGAGCAC 
TTTGATGAAG 
GGTCACAGCT 
GATGAGAATG 
AAAGGAACTC 
ACACTAGTCA 
GTCTGAGAAT 
GTCAAAGGCC 
TCACCCCAGG 
TTTTATAAAA 
GTGTCTCACA 
TTACTAATCC 
GCATGTTTCT 
GACATGAGAA 
GAAAGAAAAC 
TCCTTAGTAG 
GTTTAAGAAT 
ATCCATCTTT 
CATTGCATGC 
TCCCCAGTTA 
TAACTAAAAG 
CAGCTCGCCC 
CTGAGATTTT 
TCAGTGGATA 
ACAGGGACTT 
GTAAAACAAT 
TTCATTAGAG 
ACCCAGGTAG 
GAAATTACAA 
CTTTCTAATC 
GTGGCTCATG 
CAGGAGTTCA 
AAAATTAGTC 
GAAGGATCAC 
CCAGCCTGAG 
AGTACTATTT 
CCAGTGACTG 



ACGTGCCCAC 
TGCTCGGACT 
ACCCCAACAC 



TCATAGGGAC 
TGAAAGAGAA 
AGTGCTACTT 



GCTGTTCAAT 
GCAGCCAATC 
CCTTTGCACG 
GCCCACTGCC 
GCTGGGCCTG 
ATCTCACATT 
GAACCCGAAA 
CTTGGGACTA 
TTCATTTGTA 
ATACCAAATT 
ACCAAAGCTA 



GCACTTAATT 
CAACTTTAAG 
TCTTCTGCTA 
CCTGTAATCC 
AGACCAGCCT 
AGTTGTAGTG 



CAGCTTTGTT 
TGTGGATGAA 
TCCCGAGGTC 
CACGTCACCC 
GCTCCCAGGG 



1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 



CATCTGAATT 
TGGCCTGATT 
GTAGTATTTC 
TACAGTATTT 
CCTCTTCCTC 
TAACCCAGTT 



CATGTAAGTT 
GCCTCTCTCT 
CCAAGCTCTG 
TGGCCCTGGG 
TCCAAACATC 



CCCTTCTCTC 
GAAAAACGGG 
TATAAATGGA 
TACAAACAAT 
GCATGATTGC 
AAAATGAGAG 
ACCACTTCCC 
AGAACCAGCA 
GTCTTAAGAG 



2220 
2280 
2340 

2460 
2520 

2640 



3120 
3180 
3240 



3420 
3480 
3540 



3 CGCRGNDEKK MIiKCVWGDG AVGKTCLLMS YANDAFPEEY \ 
TVTVGGKQHL LGLYDTAGQE DYNQLRPLSY PNTDVFLICF SWNPASYHN VQEEWVPELK 
DCMPHVPYVL IGTQIDLRDD PKTLARLLYM KEKPLTYEHG VKLAKAIGAQ CYLECSALTQ 
KGLKAVFDEA ILTIFHPKKK KKRCSEGHSC CSII 



Seq ID NO: 96 DNA sequence 
Nucleic Acid Accession #: NM_003654.1 

Coding sequence: 367-1602 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGGGAGGGCG CGGGAGGCGG AGGATGCCGC CGCGGCTGCT GCCGCCGCCG CCACCCGCGG 

GTCCCCGGCG ACCCTACTCC AGACCCGAGG ATGGAGCCGG CGCTGGGCGC TGCAGCTGCT 

CCCGGCGCGT CCCCGACCAG GTAGCTGGTG TCACTTCGGT GTGGTTGGAA CAAGACTTTC 

TCCCCAGCTG CATTCCCGGA GGCGCCCTTT CGACCTGGAG GCCGGGTCTG CTGGCCACAG 

GGCTGCCGCA CTGGCTGGGA CTGCCAGCTG GGCCTGGAGA CGCTGGTGGC TGTGGACTCC 

CCAGCTTGGA GCAGTCCCTC TTTGACCTCA CCCCTTGGAG AAGCAGCCCC ATGAAGGTGC 

CCAGCCATGC 
CAGTACACGG 
GAGGCCGGGC 
CGCAAGACCC 
CTCTTCAACC 
ACGCTCATCC 
GCCAGCCGCG 
ATCAAGCCGC 
GTCCTCTGCT 
GGGGACTGTG 
GAGCGCAGCC 
CTGGTGGAAG 
ATTCTGGCTT 



AATGTTCCTG 
CCATCCGCAC 
TGGCCGAGCG 
ACATCCTCAT 
AGCACCTGGA 



CGCCGGTCAA 
CCCGGCCTGT 
TGCGCAAGTG 
ACGTGGCCAT 
ACCCGCGATT 



GAAGGCCGTC 
CTTCACCGCC 
ACTGTGCGAG 
CCTGGCCACC 
CGTCTTCTAC 
CCAGGGCAAG 
GAGCCTCTAC 
CCACACCACC 
GTGCGACCCT 
CGGGCTACTC 



TTCTCCAACT 
TTGGTGCGCT 
TTCCTGGGCA 
GACCCCACCC 
AAGTGGCGCT 
GTGCTGGCCC 
GTCAGCCTGG 
GGGCGGGAGG 
TAACCCCTCC 
CTTCTGCCCC 
AGAGGGCGCC 
TCTCTGTGCG 
CGGGCACTCG 
TTTGCACTGT 
TTTATGAATG 
CCCTTAGAGC 
TTACTGTGAG 
GTCTGAGTCT 
TTGATCTCGG 
CAGTTTGCTA 
AGTGCAATAA 



GGAAACCCTA 
CCGTGTCCAC 
ACGAGGACCT 
TCCCGCTGGA 
TGGGCAAGCA 
TCCGCCTCTC 
AGCTGGGCTA 
TGGAGGAGCG 



AAACCTCAAG 
GACCTTCCGC 
CAACCTGGAC 
CGGCCTCATG 
GGCTCGGAAC 



CTCCTCCTTG 
AAGTCCTTTC 
GAGAGCCCCA 
ACGCGCAGCG 
CTGTTTGAGC 
AGCCCGGCCG 
GACTGCGACC 
GACAGGATCT 
CCGGGGCCAG 
AACCTGACCG 
CGCGTGCCCG 
GTCATCCAGC 
GACACGTACC 
GTGACGCAGC 



CCCTGGCCTC 
ACACCTGCCC 
CCTTCGCCTA 
GCTCCTCCTT 
CCCTCTACCA 
ACCGGCGGGT 
TCTACTTCCT 
TCCGCCGCGG 
CCGACCTGGT 



AGGTGAACGA 
TGGTCCGAGA 
GGCTCTGGCG 



CTACGACATC 
CAAGATCGCC 
GGACTTCCGC 
CGGTTTTGAT 



CTCTCCCACC 
TTTTTTGTCT 
TGAAGTAGGG 
GACGGTGACA 
CGAGGCGACT 
CTTACTATTC 
GTGTCCATCC 
AGCGAAACTG 
GTGAACGTGG 
CGTGGCCGCC 
GGTCCATCTG 
AATGAACATT 
TCACC 



AAGGTAAGAG 
TTTCCCCATC 
CCCCCTCCTG 
ACCTGTTTCT 
CCTGGACCAG 
TGATATTTCT 
GAAATTGAAA 



CCTATGAAGA 
GCCCGCTGGA 
ACCGTGCGAA 
GTGGCCTTTG 
GCCTCGGAGG 
CCCTTCTCGT. 
AAAATGGACC 
GTCCTTCCTG 
CACTACGTCT 
CCCACCCCAT 
GCACCACATT 
TTGAATGGGT 
GATACAAACA 
CCTGCCTCCT 



GGCTCAAGGG 
AGACCGAGGA 
TCCAGAACAA 
ACTCGGCGGC 
CCCAGAACGC 



CATTGCCATC 
CGGGCTGGCA 
CAACCTCTCC 
CGTGGGCCAG 
CGTCCAGAAC 
CATGCTAGGC 
GGAGAACTAC 
GGCCAGCCGG 
CCTGGAGGAG 
GGCGTGCCGC 
CCTGCGCGCC 
CCCCCGCGGC 
GCTCTGGTAC 
GTGCGAGGAC 
CAAGTACATG 



GTTTCCAGTC 
TGATGACTGA 
TTGTGCCAAA 
TGCTTTATCT 



GACCCGGGCG 
GTTTTTAACT 
CCCCCAGCTC 
TGGACGGGAA 
TCAGACACAT 
TACACATCCA 
GAGTGGTCGG 
AGAGGACCAC 
GCCCCTGACG 
CCTGTCGGTG 



CACGCGGGGC 
CACGGCCGAG 
CTGCCAGCAG 
GAACCCCTCG 
GTGCGGGTGG 



TGAATCTTAT 
AAGAAAAAAA 
GTGTTTTCTG 



TCACTGGGGC 
GGATGTTGGG 
CACACGCACA 
GTATCTAGTT 
TTGTCTCTAA 
CCCATTTCCC 
AGGCAGGTTT 
CTGTCTGTCT 
GAGCTTCTGA 
AAGAGTGGAT 
TAAATAAAAG 



1260 
1320 
1380 



2100 
2160 
2220 
2280 
2340 
2400 



I I I I I I 

MQCSWKAVLL LALASIAIQY TAIRTFTAKS FHTCPGLAEA GLAERLCEES PTFAYNLSRK 

THILILATTR SGSSFVGQLF NQHLDVFYIiF EPDYHVQNTI) IPRFTQGKSP ADRRVMLGAS 

RDLLRSLYDC DLYFLENYIK PPPVNHTTDR IFRRGASRVL CSRPVCDPPG PADLVLEEGD 

CVRKCGLLNL TVAAEACRER SHVAIKTVRV PEVNDLRALV EDPRLNLKVI QLWDPRGIL 

ASRSETFRDT YRLWRLWYGT GRKPYNLDVT QLTTVCEDFS NSVSTGLMRP PWLKGKYMLV 

RYEDLARNPM KKTEEIYGFL GIPLDSHVAR WIQNNTRGDP TLGKHKYGTV RNSAATAEKW 

RFRLSYDIVA FAONACQQVL AQBGYKIAAS EEELKNPSVS LVEERDFRPF S 



Seq ID NO: 98 DNA sequence 

Nucleic Acid Accession #: NM_002852.1 

Coding sequence: 68-1213 (underlined sequences correspond to start and stop codons) 



I 

CTCAAACTCA 
TCCAGCAATG 
GAACTCGGAT 
CCATCCCACT 
GCTCTTCATC 
CGACGTCCTG 
CCTGGCGAGG 
CGAGCTGCTG 
GGCGCAGCGC 
GACGCGAGCC 
TTGTGAAACA 
AGTGAGACCA 
ATTAAACAAA 
GTATCTCAGC 
TGAAGCCATG 
AGGGCTCACA 
AGGTCACATT 
TGTGGGTGGT 
CTGGGATAGT 



GCTCACTTGA 
CATCTCCTTG 
GATTATGATC 
GAGGACCCCA 
ATGCTGGAGA 
CGGGGCGAGC 
CCGTGCGCGC 
CAGGCGACCC 
CCAGAGGAGG 
GACCTGCACG 
GCTATTTTAT 
ATGAGGCTTG 
ACCATCCTGT 
TACCAATCCA 
GTTTCCCTGG 
TCCTTGTGGG 
GTTCCTGAGG 
GGCTTTGATG 
GTTCTTAGCA 



CCGCCAGCTG 
TTGTGCTCTC 
WVTTTGGAC 
CTGCGGTCAG 



TCCCAATGCG 
AGTCTTTTAG 
TTTCCTATGG 
TAGTGTTTGT 
GAAGGTGGAC 
TAAATGGTGA 
GAGGAATCCT 
AAACATTAGC 
ATGAAGAGAT 



GCGGGAGGAG 
CGCAGAGGCC 
CCGCAGGCTG 
CCTGGCCGCG 
CTGGGCTGCC 
TTCCAAGAAG 
TGCCTGCATT 
CACAAAGAGG 



41 

TGGAAAGAAC 
TGGTCTGCAG 
AACGAAATAG 
GAGCACTCGG 
ATGCTGCTGC 
CTGGGCCGGC 
AGGCTGACCA 
GCGCGTATGG 
GTGCTAGAGG 
CGGAGCTGGC 
ATTTTTGGAA 
TGGGTCAAAG 
AATCCATATG 
GAGGAGAACA 



51 

I 

TTTGCGTCTC 
TGTTGGCCGA 
ACAATGGACT 
AATGGGACAA 
AAGCCACGGA 
TCGCGGAAAG 
GTGCTCTGGA 



AGCTGCGGCA 540 

TGCCGGCAGG 600 

GCGTGCATCC 660 

CCACAGATGT 720 



CCACCTGTGC GGCACCTGGA ATTCAGAGGA 900 

ACCACTGTTG AGATGGCCAC 960 

CAAGAAAAGA ATGGCTGCTG 1020 

AGACTCACAG GCTTCAATAT 1080 

GGAGGAGCAG AGTCTTGTCA 1140 



GCAGATTGGC 
CTTCTCTGGG 
AAGAGAGACC 
CATCCGGGGG 
GTATGTTTCA 
AACACATGCC 
TGAAAGAGAG 
AAGGAAAGAC 
TTTCAGTTTA 
TGTTGAACAG 
GAATTTTACA 
TATGTACCTT 
ACTATAAATG 
AAGTTATATT 
AAATAAAATA 



AATATTGTTG 
TAAATGTTGT 
AGTTGGGAAG 
AGTTGAGACC AATCTTTATT 
ATTGGAAAAA GCTTTTGAGG 
TCTGTCAGAT 
GTTTTACTTT 
AACAAAATAA 
AAATGATGAA 
GTTATAATCG 
TTTGTATTAA 
CTAAAAAAAA 



AGGGACAATT 
TTGGAAGAAT 
ATTACAAAAA 
TAGTTTATGT 
GCAAAAGGGA 
TTTTATAAAA 



CACAGAGATC 
TTGAAGCCAA 
TCAGTGCATA 
TGTACTGGCC 
ATAATGTTAC 
AAACTCTCAA 
TCTTTGGTTA 
GATTTGTTGT 
AACATATTTA 
AATGTCACGT 
TTTAAGACTA 



CAGCCACATG 
AGAAAGAAAC 
ATAGGAACAC 
AAATACTGAA 
TAGACTTTAT 
ATAATTAAAA 
ATTTTGTTTT 



TACTACAAGG 
TTTTGAGAAG 
TTTTTGTAAA 



GAGGAGCTCA 
TCACACTTAA 
TTGAGACTAA 
TAAACAGTTG 
GCCATGGTGC 
AGGACTGTAT 
GGCCAGAGAT 
TTGTTATTGG 
TGACTTAACA 
ATAGTCATAT 
GCTCTACTGT 



21 



41 



51 



I 1 I I 1 

MHLLAILFCA LWSAVLAENS DDYDLMYVNL DNEIDNGLHP TEDPTPCDCG QEHSEWDKLF 
IMLENSQMRE RMLLQATDDV LRGELQRLRE ELGRLAESLA RPCAPGAPAE ARLTSALDEL 
LQATRDAGRR LARMEGAEAQ RPEEAGRALA AVLEELRQTR ADLHAVQGWA ARSWLPAGCE 
TAILFPMRSK KIFGSVHPVR PMRLESFSAC IWVKATDVLN KTIIiFSYGTK RNPYEIQLYL 
SYQSIVFWG GEENKLVAEA MVSLGRWTHL CGTWNSEEGL TSLWVNGELA ATTVEMATGH 
IVPEGGILQI GQEKNGCCVG GGFDETLAFS GRLTGFNIWD SVLSNEEIRE TGGAESCHIR 360 
GNIVGWGVTE IQPHGGAQYV S 

Seq ID NO: 100 DNA sequence 
Nucleic Acid Accession #: NM_007351.1 

Coding sequence: 72-3758 (underlined sequences correspond to start and stop codons) 



CTGCTATCAA 
AAACTACTGA 
GCATTGGGCT 
AGACTATGCC 
CCACTCGGGT 



ATCAAACTCT 
TCCCAACCAA 
CTACACTGAA 
ACACAGTTGG 
GAGCCCCACG 
ACCAAAAATC 
GGTTATCTCC 
CTTGTGGCTG 
ATAGGATGCA 
GGCCGAAATG 
AAAGTCATAC 
ACCCAGAAGT 
TTCTGCAGAA 
CCTCCCTAGA 
GTCTAAAATC 
TTTTTCAAAA 
CAGAGGACCT 
TAGCAGCCCA 
TGGAACTAAG 
CTATTAAAGA 
CAAGAAGCAT 
ATGAGCAGCT 
CAGTTAGCAA 
GTTTGATGAT 
TCACCGTCTC 
CCAAATGCAG 
TAAATCAAAC 
AGCAACTAAA 
CATCACTCAG 
AGATAGAAAA 
AAAGACACAA 
TCAATGAATA 
ATGCTATTGA 
ATAATAGTGA 
CTCAGTTCCA 



AAAGGCCATA 
GATGAAGGGG 
TAACAACAGT 
TTCTGCTTCA 
CATGTCGGCG 
ATCAACACTG 
CACATCCACA 
CGCTAGCATC 
ATTTCTTCAG 
AGGCACTGGA 
GGAAACATAC 
AAATTTCGAA 
CACAGTGACA 
GACCGGTGGA 
ACATAAAATT 
TCAACTAAGA 
AGCTGTTGGC 
GATGCAAAAA 
GAAGATTGAC 
AGGAAAAGTC 
CAAAAGCATT 
TGACATGCAA 
CGAAAGCACC 
GCAAAAGTTT 
GAATCACATT 
ACTAGAAGTA 
TCTGTATTAT 
TTTATCAACT 
TAATGTCACT 



AGGATTTTGT 
GCAAGATTAT 
AAGCATTCTT 
GTTCCTCCAA 
GAGATAGCTA 
CCTCCCTCAG 
GAGAAAGCAG 
AAGTTCAATC 
AGCTTTGCCA 



ATAAAATACA 
CAACTCCAGA 
AAACAAGTGC 
AAGGAGTGGT 



I 

CACATGAGCT 
TTCTAGTTTA 
TGAGGATGGG 
AAGTTTGCAA 
GGCAAGAACT 
ACCTGCTGAG 



ACCTTGCTTC 
TGGAGTGGGG 

AACTCTCAGA 
ATACTGCCAA 



CTCAGCCGGG 
ACAACTAGAG 
TTGGACAACC 
TCCTGTCCTC 
GTCACCTCAT 



ATGACTGATC 
AATATTTCTT 
AGCGAAGATA 



GAAAGTCAAA 
GCGTTGGAGG 
GTGACAGCAG 
GAAAGAATTG 
AGGTCACTTA 
AGAGATCTCA 
TGGATTGGAG 
AGCAAAGTTT 
CTGAGCAGCA 
AGGTGAACTA 
TGACTGTGAA 
AAAGCAGAGA 
TAAGAGACAT 



ATCAGTGGTC 
TGAACAAGCA 
CACTGGAGGC 



AGGCAAATAA TTCAAAAAGT 
GTTTTGGTGC AAGAGAATCG 
GTGAATGTAA GGCAAGAAAT 
AAGCAGACTC ATTTAGAAGG 
GAATCCCTCA ATAAAACTCT 
GAACAGGTAT CAGACCAGAA 



GTGTGCTTAT 
TGTCCCAGGT 
GAAGATATCC 
GTGCTGTCCT 
GATACACACC 
GCAGCAGCAA 
CCAGGCAATG 
TGATGTAAGG 
ATTTCAATCT 
AGTAAGAGAA 
CAAGACTGTA 
TAATGAATCT 
GCCCACTTTG 
GACTCTTACA 
TGCTCTAGAA 
TTCTAAATTG 
GAATGCTCCA 



AATCTTACCC 
CTTTCCAATT 
ACTTCTCTAA 
GTGGGAAATC 
AGAACTGACT 
GTACATACCA 
GGGAAAGGAC 
AATCCTGTCT 
GGATACAGTG 
AACCAGGCTG 
GGCTGTGGTG 
AAACTGACTC 
AACACTTACT 
CTTCTAAAAG 
CAATTTAAAA 
TCAAGTCTAT 
GTGGTTTCAA 
ACTGATATAG 
TGTGAGAAGC 
CAGGAACACT 
AAGGAAGTAC 



TTTGAAGATT 
TTTGGAGATG GAGAAAGAGT 
AAATGATTTT AAATTTCAAC 
ATTGGCTGAA 
TGATTTGACT 
ACAGACAATG 
TCTGACTAGT 
CTTACTTAGA 
TGCCTTAGAA 
TTTCATTCAA 
GATCCATCAT 
CCGTCTGAAT 



TATGATATGG 
ACATATGAAC 
GCTGTCAATA 
AATGAAGTAC 
ATGGAAGATG 
GATAACTATG 
AAATGTACCT 
GATTCTATTC 



TGCACATTCA 
CTCTCAGAGG 
TTAAGGACAC 
CAATGGACAA 
AGATCCTTCA 
AACCAAAGGA 
GTCTAAATTT 
AGGGTCGTGA 
GCCTCAATAA 
CCCTAAAAGA 
CCGATATGGA 
AGACTTTGGT 



AGAAAGCAAG 



AGAAGAGAAT 
TAAGATGGAC 
ACCCTTGCTT 
AGCAATAGTG 
TATTATCAAA 
TGATGCCTTA 
GACAATGACT 
GACTTTAAGT 
AACTATTTTG 
CAATGACAAT 



AAGAAGCAGA 
ATTAACAATC 
GACATGTTAT 
TTACATGTGT 
AAAATGAGTG 
GAGCAGGGAG 
ATAAGGAAAA 
GAACTTACAA 
GAAAGACGTA 
ATTATAAATA 
ACTATTAAGG 
ACATTTATTC 
CAGAGATATA 



1800 
1860 
1920 
1980 
2040 
2100 
ACTTTGTTTT 
AGTCCAACTT 
ACCAGCAAAA 
ATTTTGAGAC 
ATATTTCAGT 
AAGTATTAAA 
TCTTTTCGCT 
GTGTGTCAGA 
AACTTCTTCA 
CTGCCCTATC 
TTGTCAAGTC 
CAACGGTAAA 
TATATCCTGA 
TAAATGGAAG 
CTATCAAGCT 
ATGCACCCAT 
TCCTGTTTAA 
TTAGAATTCC 
ATATTTCTGG 
TTAACAGTGA 
ATGGGCAGGA 
TTACTACATT 
CACCTTTATT 
TTGGTTTTTC 
TCTATTTTAT 
CTAAAGAAAT 
TTATTTCTTC 
ATATTATCAG 
TAAATATATA 



GCAAGTCGCC 
CCAAAAGATG 
TATGAGTCAT 
TCGGTTGCAA 
TAAAAAAGGC 
TTCCAGATTT 
TAACAAAACT 
ACTGAATGCT 
GAAAGGTCTA 
TAATTCAACT 
TCAGAAGCAA 



AACTACTCTT



AAGACCCTTG CAGGTATTCC CAGAGATGAG 
TATCAAATGT 
TTGGAAGAAA 
GACATTGAGT 
AGTGTAGTTA 
AAGGCGTTGG 
CTCCACGAAG 



GGAGTATTCA 
AACTAGCTTT 
TGTGGAAGAA 



TAACTTGGAT 
GTATCTTGGA 
ATTTTTAGTG 
AATACACTGT 
AGTCTGGTTA 
TAGTGGCTAT 
GAGAAACAGC 
TACAGGAAAT 
AAAATTATTT 
TTAGTGGCAC 
ATTTTAAGTC 
TCACAGTTTT 
ACACACATTT 



ACAGAATTTG 
TGTTGTATAG 
GTAAAATCAT 
GTCCTGATAG 

AGCTGTAGTC 
ACCTGTGCCT 
AATGCTTTAG 
TTTGCATCTC 
GTCAATTATG 
GTATATGTTT 
GTTGATGGAA 
GATAGGGTTT 
CGACTTGCAA 
TTATTATATC 
CAGTGTTTTC 
GAAAATCAAC 
GAATATTGTT 



CAAATGAGAG 
AAGCAAAATC 
TTTTAACAAT 
AGTGGATAAA 
TGGAACCAAT 
ATCGATCGTT 
TGCCAAAGAA 
GCCGGACTCA 



GCAGACATCC 
CTCCAGATTT 
ATACGTATGG 
GAGCTTCATA 
TCAAGTACAC 
TAGACAAGCT 
TAACTGGGGA 
AAGGAACAAT 
GTACATAAGT 
ATTTATCTTT 



ATTGCAATGG 
CTTTCCAATT 
TCTAGATTCA 



TAATGTCTGA 
GTGAATTTGT 
AAAGTAATAT 
AAACACTTAA 
CAAATTTAAA 



AGATCAGGCT 
TATCCATCTT 
GTGTCACAAT 
ACATTCCCTG 
AATTCAAATA 
GCCTGGTAGT 
AATTAACGCA 
AAGAAACACG 
CCAAAATGGG 
TTTTACTGGT 
TTCCAAAGGA 
AATGACTATA 
TACCCCAAGA 
CATCGAGTCA 
TGCATTTGAG 
TGCCTTATTA 
TCCAGCCAAG 
TAGTATGAAA 
GCTTGCACAT 
AATATGAGTA 
ATATGAAAGA 
TAGCATAATT 
TATAAAACGG 
CTTTTGTTAT 
TAAATTACTC 



AAACTAAATC 
GTGAGAAAAT 
ATTTCCAAAA 
ATACCTTATT 
CTTCAACTGC 



CCAGATATTC 
AAAACTCAAG
CTGGCAAATG 
CTTAAGAAAC 
GACAACATAA 



GACAACTGCA 
TCTTACAGAT 
CCTGGTCCTA 
ACTGGAAAAT 
TTTAGTGCTC 
TCTGAAAATA 
GAATTAAATT 
TTTCCCCCTG 
AACAGACTAT 
CTGCTCTGTT 
AACTTGTATG 
GTTCTTGATC 
ATTCCTATTC 
TAATTACAAC 
TCCCTGTATA 
AAAAAATG 



3060
3120 
3180 
3240 



35 | 



MKGARLFVLL 
MSAEIATTPE 
ASIKPNPGAE 
ETYLSRGDSS 
TGGSCPQRSQ 
AVGRGVAEQQ 
GKVSEDKSRE 
ESTRQIIQKV 
LEVKQTHLEG 
NVTEYMSTLH 
NDFKFQLKDT 
QTMTYEQPKE 
ALEMEDGLNK 
RLNDSIQTLV 
MSHLEEKLLL 
SRFKALEAKS 
KGLTEFVEPI 
LTTVLIGRTQ 



SSLWSGGIGL NIJSKHSWTIP 
ARTSEDSLLK STLPPSETSA 
SWLSNSTLK FLQSFARKSN 



I 



KISNPVYRMQ 
QQQGCGDPEV 
FQSLLKGLKS 
NESWSIAAQ 
ALEQEHSRSI 
ENIKKQSLMM 
EENLHVLNQT 
AIVIRKKIEN 
TMTIINNAID 
NDNQRYNFVL 
TTKISKNFET 



HKIVTSLDWR 
MQKMTDQVNY 
KSINVLIRDI 
QKFVLVQENR 
LYYESLNKTL 
LQMFEDLHIQ 
LAEVLFPMDN 



PAEGVRNQTL 
EQATSLNTVG 
CAYVHTRLSP 
CCPGYSGPKC 
QAMKLTLLQK 
VREQFKIFQN 



I 

SASVPPNKIQ 
TSTEKAEGW 
GTGGIGGVGG 
TVTLDNQVTY 
QLRAQEQQSL 



SKLPCEVHEQL 
ESKINNLTVS 
KMDKMSEQLN 
IIKELTKRHN 
TLSTIKDNSE 



NKTLHEVLTM 
NSTCCIDRSL 
EYSSCSRHPC 
VAFFASHTYG 
FliWDGIDKL 
SGYLLYRT 



QTLIPYYISV 
CHNASTSVSE 
PGSLANWKS 
QNGGTCINGR 
MTIPGPILFN 



DLTYDMEILQ 
LLRNEVQGRD 
IHHKCTSDME 
QKMYQMFNET 



LNATIPKWIK 
QKQVKSLPKK 
TSFTCACRHP 
NLDVNYGASY 
IHCDRVLTGD 



51 
I 

SLQILPTTRV 60

KLQNLTLPTN 120

TGGVGNRAPR 180

VPGGKGPCGW 240

IHTNQAESHT 300

DVRNTYSSLE 360

KTVSSLSEDL 420

TLTCEKPIKE 480

NAPAAESVSN 540

ECEDMLSKCR 600

PLLEQGASLR 660

DALERRINEY 720

TILTFIPQFH 780

TSQVRKYQQN 840

DQALQLQVLN 900

HELPDIQLLQ 960

INALKKPTVN 1020
1080
1140

ALLELNYGQE 1200



Seq ID NO: 102 DNA sequence 

Nucleic Acid Accession #: NM_000873.2 

Coding sequence: 57-884 (underlined sequences correspond to start and stop codons)



I 

ATCTCCCTCC 
CCTCTTTCGG 
CGGATGAGAA 
GGTCCCTCGA 
CCTCTCTAAA 
ACATCTCCCA 
TGAATTCCAA 
CTTTGGTGGC 



GGTCAACTGC 
TAAGATTCTG 
TGACACGGTC 
CGTCAGCGTG 
TGTGGGCAAG 
CACCCTCTTC 
CCCTGCTCCG 
CCGCAACTTC 
CAAACACTCA 



21 
I 

C TGGCTGGTCC 
CTGACTGTGG
GTACACGTGA 
AGCACCACCT 
CTGGACGAAC 
CTCCAATGCC 
TACCAGCCTC 
TCCTTCACCA 
CTGTTCCGTG 
CAGGAGGCCA 
TCCTGCCTGG 
GCCCCGAAGA 



CTGCGAGCCC 
CCCTCTTCAC 
GGCCAAAGAA 
GTAACCAGCC 
AGGCTCAGTG 
ACTTCACCTG 
CAAGGCAGGT 
TTGAGTGCAG 
GCAATGAGAC 
CAGCCACATT 
CTGTGCTGGA 
TGTTGGAGAT 



GTGGAGACTG 



I 

CCAGAGATGT 
TGTCCAGGAT 
GAGCCCAAAG 
GGTCTGGAGA 
TTGGTCTCAA 
CAGGAGTCAA 
CTGCAACCCA 
GTGGAGCCCC 
GAGACCTTCG 
GCTGACAGAG 



TGAAGTGGGT G 
GAAACATTAC 1 
CTCCGGGAAG C 
CATCCTGACA C 
GGTGCCCACC G 
TCTGCACTAT G 
CAACAGCACG G 
CTTGATGTCT C 
CTATGAGCCT GTGTCGGACA 720 



GCCAGATGGT CATCATAGTC ACGGTGGTGT CGGTGTTGCT GTCCCTGTTC GTGACATCTG
TCCTGCTCTG CTTCATCTTC GGCCAGCACT TGCGCCAGCA GCGGATGGGC ACCTACGGGG 
TGCGAGCGGC TTGGAGGAGG CTGCCCCAGG CCTTCCGGCC ATAGCAACCA TGAGTGGCAT
GGCCACCACC ACGGTGGTCA CTGGAACTCA GTGTGACTCC TCAGGGTTGA GGTCCAGCCC 
TGGCTGAAGG ACTGTGACAG GCAGCAGAGA CTTGGGACAT TGCCTTTTCT AGCCCGAATA 
CAAACACCTG GACTT 



1 11 21 31 41 51

I I I I I I

MSSFGYRTLT VALFTLICCP GSDEKVFEVH VRPKKLAVEP KGSLEVNCST TCNQPEVGGL 
ETSLNKILLD EQAQWKHYLV SNISHDTVLQ CHFTCSGKQE SMNSNVSVYQ PPRQVILTLQ 
PTLVAVGKSF TIECRVPTVE PLDSLTLFLF RGNETLHYET FGKAAPAPQE ATATFNSTAD 
REDGHRNFSC LAVLDLMSRG GNIFHKHSAP KMLEIYEPVS DSQMVIIVTV VSVLLSLFVT 
SVLLCFIFGQ HLRQQRMGTY GVRAAWRRLP QAFRP 

Seq ID NO: 104 DNA sequence 
Nucleic Acid Accession #: NM_001795.2 

Coding sequence: 121-2475 (underlined sequences correspond to start and stop codons)



GACGGTCGGC 
AGTCCAACGG 
ATGCAGAGGC 
GCAGCAGTGG 
ACCCACCGGC 
AACACCTCAC 
AAGTACCTGC 
GACGTGTTCG 
GTCATTGTGG 
GTTCATGACG 
CCTGAGTCGT 
CCCACTGTGG 
GCCATCGATA 
GCCAGGTATG 
ACGGCCACCG 
ACCAAGTACA 
TTTGTTGAGG 
GACTACCAGG 
CCCATGAAGC 
GACCCCACCA 



TGACAGGCTC 
AACAGAAACA 
TCATGATGCT 
CAGCAGCAGG 
GCCAAAAGAG 
TTCCCCATCA 
TCAAAGGAGA 
CCATTGAGAG 
ACAAGGACAC 
TGAACGACAA 
CGGCTGTGGG 
GAGACCACGC 
ATTCTGGACG 
AGATCGTGGT 
TGCTGGTCAC 
CATTTGTCGT 
ACCCAGATGA 
ACGCTTTCAC 
CTCTGGATTA 



GACAGAGCTC 
TCCCTCAGCC 
CCTCGCCACA 
TGCTAACCCT 
AGATTGGATT 
TGTAGGCAAG 



I 



I 



GCTGGACCGG 
TGGTGAAAAC 
CTGGCCTGTG 
GACCTCAGTC 
CTCTGTCATG



CTGAAGGAAA 
GCTAGGCATA 
GTCACAAAAA 
TATAACCTGA 
TCCATTGTGC 
AAGCCCTACC 
TCCGCAATAG 
GAGAACAACT 
GGGCAGTTTG 
GGGATGCCAA 



CAGATGTGGA 
ACCAGAAGAA 
GCATTGGATA 
AGGGGGACAT 
CTGTGGAGGC 
AAGTCCACAT 
AGCCCAAAGT 
ACAAGGACAT 
TTACCCTCAC 



GGAAGCGCGA 
TCTGCAAGAC 
GCCTGAAGAC 
GCCCCAGAAC 
CATTGAGACA 
TGAATACATC 
ATACATGAGC 
CGAGCCCCCC 
GCCTCTGATT 



CCACAGGCAC 
TCGGGCGCCT 
GCCCAACGGG 
TGGAACCAGA 
ATCAAGTCAA 
AAGGTCTTCC 
GAGAATATCT 
CTGGAGACTC 
TTCACGCATC 
ATCTCTGTGA 
TACCAAATCC 
ATAACGAAAA 
GATGCCCAGG 
ATCAATGACA 



GATCTGTTCC 
GCCTGGGCCT 
ACACCCACAG 
TGCACATTGA 



CAGAGTACCA 
CTTCCAGCTT 
GGTTGTTCAA 
CAGCAGTGGA 
TGAAGGGGAA 
GCTTGGACCG 
GCCTCCGGGG 



TCCTGGGAAG 
GCTGGCAGTG 
CCTGCTGCCC 
TGAAGAGAAA 
CAAGAATGCC 
AGAGACAGGA 
CCTCACTGCT 
CACCATCAAA 
TGCGTCCGTG 
TGCAGACGAC 
AGAGTATTTT 



CGGATGACCA 
AACCCCGCCC 
CAGCAATACA 
CCTCCCGCGG 
ATTTTCCAGC 
GGCACAGTGC 



GCACCTCTGT 
AGTACAGCAT 
ACAACGAGGG 
GCTTCATCGT 
GAAACAGAGC 
AGCCTTTCTA 
TGGCCATGGA 



GGACTCGGGC 
CTTCACCCAG 
GGGCTCTCTG 
CTTGCGGGGC 
CATCATCAAG 
CGAGGCCACA 
CCAGGTCATT 
CCACTTCCAG 



TTACAATGAG 
CAAAGAACTG 
TGAAGTTTTG 
GTGTGAGAAC 
AACACCACGA 
GGATAATCAC 



GTGGTAGCCA 
CGGCGGCGGC 
CAGCTGGTCA 
TCGGTGCTCA 
CGGCCTTCCC 
GGGCCCGGGG 
GACGGCCCCC 
GAGTCCCTCA 
AACGACTGGG 
GAGCTGCTGT 
CCAGGCCAGT 
GCACCCCTTC 
CCTGAAATAT 
CTGGTGTTCT
ACTCAAAGAC 
TGCAGGGTCT 
TATCTGCCTG 
CATTCCCAAG 



GTCGCACGGG 
TCACCTTCTG 
TCTTACTCTG 



CCTACGACGA 
ACTCGGTGCG 
TCTATGCGCA 



CACCAGCACG 
CGAGGATATG 
CATCCTCACC 
GGCCCGCGCG 
GGAGGGCGGC 



CCTACGACAC 
GCTCCCTGGG 
GACCCAGGTT 
ATTAGGCGGC 



CTCGTGGGTC 
CCAGGAATAT 
GTCTGGGCTC 
TTCCTCTGGC 
TTTTCTAGGG 
GAGGCAAAGG 
GGAGACTGAC 



CATGATCGAG 
GCTGCACATC 
CACCGACTCA 
TAAGATGCTG 
CGAGGTCACT 
GCACCACAGC 
CCAGAGACCT 
ATGTCAGTGA
AGACATCCAC 
TCCCCAAGGC 
TCCCTGAACG 
CCTGGACAGC 
CATCATGCCC 



AAAGAACTGG 
GATTCCACTG 
GATGAGAATG 



GGCGAGATGG 
GCCAAGCCCC 
CCACCGAGGC 
GTGAAGAAGG 
TACGGCTACG 
TCCGACTCTG 
GCTGAGCTGT 
CTGGGCCTGG 
CTCCAAAAAT 
CATCAGCCTT 
TGACTATTCT 
ATAACCCTGT 
TGCAAAGCAA 
CCCTGGTAAG 

TCTCTCGGGA 



ACAGAGAAGT 
GAACCCCCAC 
ACAATGCCCC 
GCCAGCTGGT 
TCAAATTCAC 
CCAACATCAC 
CCGTGGTCAT 
CCGTGTGCAA 
TGGGCGTGAG 
TCACCCTGCT 
GCGTGCCGGA 
ACACCACCAG 
CGCGGCCCGC 
ACGCGCCTGG 
ACGAGGCGGA 
AGGGCTCCGA 
ACGTGGATTA 
ACGGCTCGGA 
GGACCCAAAC 
GGCAGTGACT 
GGGATAGCAA 
CAAATGCTGG 
CACCCACAGA 
AACAGACTGT 
GCTGGTGAGG 
GGGCAGGATT 
GCCCTAGCCC 



GTTCTTCCGA 
CTACCCCTGG 
AGGAAAAGAA 
GGAGTTTGCC 
CCTGCAGATC 
CTTGAATACT 
AGTCAAGTAT 
CTCAGACAAT 



CTACGATGTG 
GCTGGACGCC 
GGCACACGGA 
CCACGACGGC 
GTCCATAGCC 
CGACTTCCTT 



CCCCTGCAGC 
CCCCAGCCCA 
ACTCCAGGTT 
CAAATCCAGG 
CCGCCGTCTA 
GTTTAACTGC 
TCCTGGTGCC 
CTCTGCAGCC 
TGCTCCAACT 



1140 
1200
1260
1320 
1380 
1440 
1500 
1560
1620 
1680 
1740 



2100 
2160
2220
2280
2340
2400
2460
2520 
2580 
2640 
2700 
2760 
2820
CCATACTCCA CTCCAAGTGC CCCACCACTC CCCAACCCCT CTCCAGGCCT GTCAAGAGGG 3060

AGGAAGGGGC CCCATGGCAG CTCCTGACCT TGGGTCCTGA AGTGACCTCA CTGGCCTGCC 3120 

ATGCCAGTAA CTGTGCTGTA CTGAGCACTG AACCACATTC AGGGAAATGG CTTATTAAAC 3180

TTTGAAGCAA CTGTGAATTC ATTCTGGAGG GGCAGTGGAG ATCAGGAGTG ACAGATCACA 3240 

GGGTGAGGGC CACCTCCACA CCCACCCCCT CTGGAGAAGG CCTGGAAGAG CTGAGACCTT 3300

GCTTTGAGAC TCCTCAGCAC CCCTCCAGTT TTGCCTGAGA AGGGGCAGAT GTTCCCGGAG 3360

CAGAAGACGT CTCCCCTTCT CTGCCTCACC TGGTCGCCAA TCCATGCTCT CTTTCTTTTC 3420 

TCTGTCTACT CCTTATCCCT TGGTTTAGAG GAACCCAAGA TGTGGCCTTT AGCAAAACTG 3480 

GACAATGTCC AAACCCACTC ATGACTGCAT GACGGAGCCG AGCCATGTGT CTTTACACCT 3540 

CGCTGTTGTC ACATCTCAGG GAACTGACCC TCAGGCACAC CTTGCAGAAG GCAAGGCCCT 3600 

GCCCTGCCCA ACCTCTGTGG TCACCCATGC ATCTTCCACT GGAACGTTTC ACTGCAAACA 3660

CACCTTGGAG AAGTGGCATC AGTCAACAGA GAGGGGCAGG GAAGGAGACA CCAAGCTCAC 3720

CCTTCGTCAT GGACCGAGGT TCCCACTCTG GGCAAAGCCC CTCACACTGC AAGGGATTGT 3780 

AGATAACACT GACTTGTTTG TTTTAACCAA TAACTAGCTT CTTATAATGA TTTTTTTACT 3840

AATGATACTT ACAAGTTTCT AGCTCTCACA GACATATAGA ATAAGGGTTT TTGCATAATA 3900 

AGCAGGTTGT TATTTAGGTT AACAATATTA ATTCAGGTTT TTTAGTTGGA AAAACAATTC 3960

CTGTAACCTT CTATTTTCTA TAATTGTAGT AATTGCTCTA CAGATAATGT CTATATATTG 4020

GCCAAACTGG TGCATGACAA GTACTGTATT TTTTTATACC TAAATAAAGA AAAATCTTTA 4080 
GCCTGGGCAA CAAAAAAA 

Seq ID No: 105 Protein sequence: 
Protein Accession #: NP_001786.1 



1 11 21 31 41 51 

I I I I I I 

MQRLMMLLAT SGACLGLLAV AAVAAAGANP AQRDTHSLLP THRRQKRDWI WNQMHIDEEK 60

NTSLPHHVGK IKSSVSRKNA KYLLKGEYVG KVFRVDAETG DVFAIERLDR ENISEYHLTA 120 

VIVDKDTGEN LETPSSFTIK VHDVNDNWPV FTHRLFNASV PESSAVGTSV ISVTAVDADD 180 

PTVGDHASVM YQILKGKEYF AIDNSGRIIT ITKSLDREKQ ARYEIVVEAR DAQGLRGDSG 240

TATVLVTLQD INDNFPFFTQ TKYTFVVPED TRVGTSVGSL FVEDPDEPQN RMTKYSILRG 300

DYQDAFTIET NPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPAGNRAQVI 360

INITDVDEPP IFQQPFYHFQ LKENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFFR 420 

VTKKGDIYNE KELDREVYPW YNLTVEAKEL DSTGTPTGKE SIVQVHIEVL DENDNAPEFA 480

KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKFTLNT ENNFTLTDNH DNTANITVKY 540 

GQFDREHTKV HFLPVVISDN GMPSRTGTST LTVAVCKCNE QGEFTFCEDM AAQVGVSIQA 600

VVAILLCIDT ITVITLLIFL RRRLRKQARA HGKSVPEIHE QLVTYDEEGG GEMDTTSYDV 660

SVLNSVRRGG AKPPRPALDA RPSLYAQVQK PPRHAPGAHG GPGEMAAMIE VKKDEADHDG 720 

DGPPYDTLHI YGYEGSESIA ESLSSLGTDS SDSDVDYDFL NDWGPRFKML AELYGSDPRE 780 

Seq ID NO: 106 DNA sequence

Nucleic Acid Accession #: none found 

Coding sequence: 1-474 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ACAGTACTCT GTGCAAAAAA CCTGGTGAAA AAGGATTTTT TCCGACTTCC TGATCCATTT 60

GCTAAGGTGG TGGTTGATGG ATCTGGGCAA TGCCATTCTA CAGATACTGT GAAGAATACG 120 

CTTGATCCAA AGTGGAATCA GCATTATGAC CTGTATATTG GAAAGTCTGA TTCAGTTACG 180 

ATCAGTGTAT GGAATCACAA GAAGATCCAT AAGAAACAAG GTGCTGGATT TCTCGGTTGT 240 

GTTCGTCTTC TTTCCAATGC CATCAACCGC CTCAAAGACA CTGGTTATCA GAGGTTGGAT 300 

TTATGCAAAC TCGGGCCAAA TGACAATGAT ACAGTTAGAG GACAGATAGT AGTAAGTCTT 360

CAGTCCAGAG ACCGAATAGG CACAGGAGGA CAAGTTGTGG ACTGCAGTCG TTTATTTGAT 420 

AACGATTTAC CAGACGGAGC TCATTATTTG TGGACTTGGA AAGATAGATG TTAATGACTG 480

GAAGGTAAAC ACCCGGTTAA AACACTGTAC ACCAGACAGC AACATTGTCA AATGGTTCTG 540

GAAAGCTGTG GAGTTTTTTG ATGAAGAGCG ACGAGCAAGA TTGCTTCAGT TTGTGACAGG 600 

ATCCTCTCGA GTGCCTCTGC AGGGCTTCAA AGCATTGCAA GGTGCTGCAG GCCCGAGACT 660

CTTTACCATA CACCAGATTG ATGCCTGCAC TAACAACCTG CCGAAAGCCC ACACTTGCTT 720 

CAATCGAATA GACATTCCAC CCTATGAAAG CTATGAAAAG CTATATGAAA AGCTGCTAAC 780 

AGCCATTGAA GAAACATGTG GATTTGCTGT GGAATGACAA GCTTCAAGGA TTTACCCAGG 840 
AC 

Seq ID No: 107 Protein sequence: 
Protein Accession #: none found 



I I I I I I

TVLCAKNLVK KDFFRLPDPF AKVVVDGSGQ CHSTDTVKNT LDPKWNQHYD LYIGKEDSVT 60

ISVWNHKKIH KKQGAGFLGC VRLLSNAINR LKDTGYQRLD LCKLGPNDKD TVRGQIVVSL 120

QSRDRIGTGG QVVDCSRLFD NDLPDGAHYL WTWKDRC

Seq ID NO: 108 DNA sequence 

Nucleic Acid Accession #: NM_002318.1 

Coding sequence: 248-2572 (underlined sequences correspond to start and stop codons)
ACTCCAGCGC GCGGCTACCT ACGCTTGGTG CTTGCTTTCT CCAGCCATCG GAGACCAGAG 60

CCGCCCCCTC TGCTCGAGAA AGGGGCTCAG CGGCGGCGGA AGCGGAGGGG GACCACCGTG 120 

GAGAGCGCGG TCCCAGCCCG GCCACTGCGG ATCCCTGAAA CCAAAAAGCT CCTGCTGCTT 180 

CTGTACCCCG CCTGTCCCTC CCAGCTGCGC AGGGCCCCTT CGTGGGATCA TCAGCCCGAA 240 

GACAGGGATG GAGAGGCCTC TGTGCTCCCA CCTCTGCAGC TGCCTGGCTA TGCTGGCCCT 300 

CCTGTCCCCC CTGAGCCTGG CACAGTATGA CAGCTGGCCC CATTACCCCG AGTACTTCCA 360

GCAACCGGCT CCTGAGTATC ACCAGCCCCA GGCCCCCGCC AACGTGGCCA AGATTCAGCT 420 

GCGCCTGGCT GGGCAGAAGA GGAAGCACAG CGAGGGCCGG GTGGAGGTGT ACTATGATGG 480

CCAGTGGGGC ACCGTGTGCG ATGACGACTT CTCCATCCAC GCTGCCCACG TCGTCTGCCG 540

GGAGCTGGGC TATGTGGAGG CCAAGTCCTG GACTGCCAGC TCCTCCTACG GCAAGGGAGA 600

AGGGCCCATC TGGTTAGACA ATCTCCACTG TACTGGCAAC GAGGCGACCC TTGCAGCATG 660

CACCTCCAAT GGCTGGGGCG TCACTGAGTG CAAGCACACG GAGGATGTCG GTGTGGTGTG 720 

CAGCGACAAA AGGATTCCTG GGTTCAAATT TGACAATTCG TTGATCAACC AGATAGAGAA 780 

CCTGAATATC CAGGTGGAGG ACATTCGGAT TCGAGCCATC CTCTCAACCT ACCGCAAGCG 840 

CACCCCAGTG ATGGAGGGCT ACGTGGAGGT GAAGGAGGGC AAGACCTGGA AGCAGATCTG 900 

TGACAAGCAC TGGACGGCCA AGAATTCCCG CGTGGTCTGC GGCATGTTTG GCTTCCCTGG 960

GGAGAGGACA TACAATACCA AAGTGTACAA AATGTTTGCC TCACGGAGGA AGCAGCGCTA 1020 

CTGGCCATTC TCCATGGACT GCACCGGCAC AGAGGCCCAC ATCTCCAGCT GCAAGCTGGG 1080 

CCCCCAGGTG TCACTGGACC CCATGAAGAA TGTCACCTGC GAGAATGGGC TGCCGGCCGT 1140 

GGTGAGTTGT GTGCCTGGGC AGGTCTTCAG CCCTGACGGA CCCTCGAGAT TCCGGAAAGC 1200 

ATACAAGCCA GAGCAACCCC TGGTGCGACT GAGAGGCGGT GCCTACATCG GGGAGGGCCG 1260

CGTGGAGGTG CTCAAAAATG GAGAATGGGG GACCGTCTGC GACGACAAGT GGGACCTGGT 1320

GTCGGCCAGT GTGGTCTGCA GAGAGCTGGG CTTTGGGAGT GCCAAAGAGG CAGTCACTGG 1380

CTCCCGACTG GGGCAAGGGA TCGGACCCAT CCACCTCAAC GAGATCCAGT GCACAGGCAA 1440 

TGAGAAGTCC ATTATAGACT GCAAGTTCAA TGCCGAGTCT CAGGGCTGCA ACCACGAGGA 1500 

GGATGCTGGT GTGAGATGCA ACACCCCTGC CATGGGCTTG CAGAAGAAGC TGCGCCTGAA 1560

CGGCGGCCGC AATCCCTACG AGGGCCGAGT GGAGGTGCTG GTGGAGAGAA ACGGGTCCCT 1620 

TGTGTGGGGG ATGGTGTGTG GCCAAAACTG GGGCATCGTG GAGGCCATGG TGGTCTGCCG 1680

CCAGCTGGGC CTGGGATTCG CCAGCAACGC CTTCCAGGAG ACCTGGTATT GGCACGGAGA 1740 

TGTCAACAGC AACAAAGTGG TCATGAGTGG AGTGAAGTGC TCGGGAACGG AGCTGTCCCT 1800

GGCGCACTGC CGCCACGACG GGGAGGACGT GGCCTGCCCC CAGGGCGGAG TGCAGTACGG 1860

GGCCGGAGTT GCCTGCTCAG AAACCGCCCC TGACCTGGTC CTCAATGCGG AGATGGTGCA 1920 

GCAGACCACC TACCTGGAGG ACCGGCCCAT GTTCATGCTG CAGTGTGCCA TGGAGGAGAA 1980 

CTGCCTCTCG GCCTCAGCCG CGCAGACCGA CCCCACCACG GGCTACCGCC GGCTCCTGCG 2040 

CTTCTCCTCC CAGATCCACA ACAATGGCCA GTCCGACTTC CGGCCCAAGA ACGGCCGCCA 2100 

CGCGTGGATC TGGCACGACT GTCACAGGCA CTACCACAGC ATGGAGGTGT TCACCCACTA 2160

TGACCTGCTG AACCTCAATG GCACCAAGGT GGCAGAGGGC CACAAGGCCA GCTTCTGCTT 2220 

GGAGGACACA GAATGTGAAG GAGACATCCA GAAGAATTAC GAGTGTGCCA ACTTCGGCGA 2280

TCAGGGCATC ACCATGGGCT GCTGGGACAT GTACCGCCAT GACATCGACT GCCAGTGGGT 2340 

TGACATCACT GACGTGCCCC CTGGAGACTA CCTGTTCCAG GTTGTTATTA ACCCCAACTT 2400

CGAGGTTGCA GAATCCGATT ACTCCAACAA CATCATGAAA TGCAGGAGCC GCTATGACGG 2460

CCACCGCATC TGGATGTACA ACTGCCACAT AGGTGGTTCC TTCAGCGAAG AGACGGAAAA 2520 

AAAGTTTGAG CACTTCAGCG GGCTCTTAAA CAACCAGCTG TCCCCGCAGT AAAGAAGCCT 2580

GCGTGGTCAA CTCCTGTCTT CAGGCCACAC CACATCTTCC ATGGGACTTC CCCCCAACAA 2640 

CTGAGTCTGA ACGAATGCCA CGTGCCCTCA CCCAGCCCGG CCCCCACCCT GTCCAGACCC 2700 

CTACAGCTGT GTCTAAGCTC AGGAGGAAAG GGACCCTCCC ATCATTCATG GGGGGCTGCT 2760

ACCTGACCCT TGGGGCCTGA GAAGGCCTTG GGGGGGTGGG GTTTGTCCAC AGAGCTGCTG 2820

GAGCAGCACC AAGAGCCAGT CTTGACCGGG ATGAGGCCCA CAGACAGGTT GTCATCAGCT 2880 

TGTCCCATTC AAGCCACCGA GCTCACCACA GACACAGTGG AGCCGCGCTC TTCTCCAGTG 2940 

ACACGTGGAC AAATGCGGGC TCATCAGCCC CCCCAGAGAG GGTCAGGCCG AACCCCATTT 3000 

CTCCTCCTCT TAGGTCATTT TCAGCAAACT TGAATATCTA GACCCCTCTT CCAATGAAAC 3060

CCTCCAGTCT ATTATAGTCA CATAGATAAT GGTGCCACGT GTTTTCTGAT TTGGTGAGCT 3120 

CAGACTTGGT GCTTCCCTCT CCACAACCCC CACCCCTTGT TTTTCAAGAT ACTATTATTA 3180 

TATTTTCACA GACTTTTGAA GCACAAATTT ATTGGCATTT AATATTGGAC ATCTGGGCCC 3240 

TTGGAAGTAC AAATCTAAGG AAAAACCAAC CCACTGTGTA AGTGACTCAT CTTCCTGTTG 3300

TTCCAATTCT GTGGGTTTTT GATTCAACGG TGCTATAACC AGGGTCCTGG GTGACAGGGC 3360

GCTCACTGAG CACCATGTGT CATCACAGAC ACTTACACAT ACTTGAAACT TGGAATAAAA 3420 
GAAAGATTTA TG 

Seq ID No: 109 Protein sequence: 
Protein Accession #: NP_002309.1



1 11 21 31 41 51 

I I I I I I

MERPLCSHLC SCLAMLALLS PLSLAQYDSW PHYPEYFQQP APEYHQPQAP ANVAKIQLRL 60

AGQKRKHSEG RVEVYYDGQW GTVCDDDFSI HAAHVVCREL GYVEAKSKTA SSSYGKGEGP 120

IWLDNLHCTG NEATLAACTS NGWGVTDCKH TEDVGVVCSD KRIPGFKFDN SLINQIENLN 180

IQVEDIRIRA ILSTYRKRTP VMEGYVEVKE GKTWKQICDK HWTAKNSRVV CGMFGFPGER 240

TYNTKVYKMF ASRRKQRYWP FSMDCTGTEA HISSCKLGPQ VSLDPMKNVT CENGLPAVVS 300

CVPGQVFSPD GPSRFRKAYK PEQPLVRLRG GAYIGEGRVE VLKNGEWGTV CDDKWDLVSA 360

SVVCRELGFG SAKEAVTGSR LGQGIGPIHL NEIQCTGNEK SIIDCKFNAE SQGCNHEEDA 420
GVRCNTPAMG LQKKLRLNGG RNPYEGRVEV LVERNGSLVW GMVCGQNWGI VEAMVVCRQL 480

GLGFASNAFQ ETWYWHGDVN SNKVVMSGVK CSGTELSLAH CRHDGEDVAC PQGGVQYGAG 540

VACSETAPDL VLNAEMVQQT TYLEDRPMFM LQCAMEENCL SASAAQTDPT TGYRRLLRFS 600

SQIHNNGQSD FRPKNGRHAW IWHDCHRHYH SMEVFTHYDL LNLNGTKVAE GHKASFCLED 660 

TECEGDIQKN YECANFGDQG ITMGCWDMYR HDIDCQWVDI TDVPPGDYLF QVVINPNFEV 720
AESDYSNNIM KCRSRYDGHR IWMYNCHIGG SFSEETEKKF EHFSGLLNNQ LSPQ 

Seq ID NO: 110 DNA sequence 

Nucleic Acid Accession #: none found, CAT_73007_3 

Coding sequence: 1-495 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CGGACGCGTG GGTCGACCCA CGCGTCCGCC CACGCGTCCG TATGGACAGA GCCTCCACTG 60 

GCTGCTGCCT GCCCGCCACA TACCCAGCTG ACATGGGCAC CGCAGGAGCC ATGCAGCTGT 120 

CTGGGTGATC CTGGGCTTCC TCCTGTTCCG AGGCCACAAC TCCCAGCCCA CAATGACCCA 180

ACCTCTAGCT CTCAGGGAGG CCTTGGCGGT CTAAGTCTGA CCACAGAGCC AGTTTCTTCC 240

ACCCAGGATA CATCCCTTCC TCAGAGGCTA ACAGGCCAAG CCATCTGTCC AGCACTGGTA 300 

CCCAGGCGCA GGTGTCCCCA GCAGTGGAAG AGACGGAGGC ACAAGCAGAG ACACATTTCA 360 

ACTGTTCCCC CCAATTCAAC CACCATGAGC CTGAGCATGA GGGAAGATGC GACCATCCTG 420 

CCAGCCCCAC GTCAGAGACT GTGCTCACTG TGGCTGCATT TGGGATGGAG TCGGGTGGAG 480

GCCCACTCTG GCTAGGGGGC GGCAGGCTGA GAGCTCACCT GTTCAGCAGA GAAGTGGAAC 540

CACTTTGCTC CTGGAGCCTG TCTACCACAG TGTTATCAGC TTCATTGTCA TCCTGGTGGT 600

GTGGTGATCA TCCTAGTTGG TGTGGTCAGC CTGAGGGTTC AGTGTCGGAA GAGCAAGGAG 660

TCTGAAGATC CCAGAACCTG GGAGTACAGG GCGTGTCTGA CAAGCTGGTC ACAGACCATG 720 

GCGAGAACGA CAGCATCGCC CATTATCACA TGGAAGACAT CACACGACTT AGGGCAACAC 780 

GCACTCAGCA GCGAGCATCA AAGGAGCCTA CGCATGGCCC AGACTGAGAG CAAGCACAAA 840
GGGC 

Seq ID No: 111 Protein sequence: 

Protein Accession #: none found, CAT_73007_3 



1 11 21 31 41 51 

I I I I I I

RTRGSTHASA HASVWTEPPL AAACPPHTQL TWAPQEPCSC LGDPGLPPVP RPQLPAHHDP 60

TSSSQGGLGG LSLTTEPVSS TQDTSLPQRL TGQAICPALV PRRRCPQQWK RRRHKQRHIS 120
TVPPNSTTMS LSMREDATIL PAPRQRLCSL WLHLGWSRVE AHSG 



Seq ID NO: 112 DNA sequence 

Nucleic Acid Accession #: NM_005424.1 

Coding sequence: 37-3453 (underlined sequences correspond to start and stop codons)



CGCTCGTCCT 
TTGCTCCCCA 
GCCAACCTGC 



CCAGACAAGG 
CACAAGGAGA 
GACTGGCATG 
TCGAGCGGCA 
CGGCTCATCG 
CCAGGTTGCC 
GGCTTCACTG 
CAGGAGCAGT 
TATGGCTGCT 
GGTCATTTTG 
CGGTTCAGTG 
CGGATCCCCC 
CGGATCAACT 



TCACACACAC 



AAGCCCAGGA 
TCTACAGTGC 
TGCGGGGTTG 
TACATGGAGG 
GCACCCGCTG 
GCCCAGGCAT 



GCTGAGTTCG 
TCCACATCTG 
CCCCTGGCTG 
GTCTCGTTCT 
AGTACCATGG 
CTGAGGCCAA 
GAGGGGGCCT 
CCGTGGTTGG 
CCCTTGGTGC 



GGGCTGATTG 
GTTGTGTCTG 
AGATCCTCAA 
GTGCAGCTGC 
GCACTGTGCT 
AGGTGCCCCG 
GCGGCCAAGA 
CACCTCGGCT 
CTGGGGATGG 
ACTGGTCGAC 
AGACAGGATA 
GGGGGCCTCC 



CCGGGCCACT 



21 

1 

GGTCGGCCTC 
GGCTTCTCAT 
CCCCCAGCGC 
CGCCTGGGGC 
GCCACCCCTG 
GCCCTCGGAC 
CGTCATCTAC 
TGTGAACAAA 
CGTGATCTGG 
TGGGCGGTTC 
CACTTACCTG 
TGGGGCTGGG 
TGTCTGCCAC 
TGAACAGGCC 
ATCAGGCTGC 
TGGCTGGAGA 
CCGACTCCAG 
CCCCTCTGGG 
CATGGCCTCA 
AGGGAACCCC 
CCTGTCCACC 
CTTGGTTCTT 
CAGCCGGCGC 
CCTGACCAAG 
ACCCATCTCC 
CATTGTGGTG 
CAGTGTTCGT 
CACCCTCATG 
TGTGGAAGGC 
GGTGGGCGAC 



TTCTTCCTGA 
CCGCCCCTGC 
CGCCTGGCGC 
CTCGTGGGCG 
GTGCACAACA 
GGTGACACCG 
AAGAGCAACG 
CTGCTGCAGC 
GAAGCCAGCC 
CGCTGGGGGC 
GACCATGACG 



TCTGGCGGGT 
CGGTGGACCT 
CTTGCGTGTC 



GCAACGGTTC 
TCTTCTCCTG 
GCCCTGGAGC 
CTGTACTTTC 
GATCCTACTT 
TCCCAAATGT 
CCCTGGGCAG 
CAGGCTGTAC



CGGGGCCTCA 
GGAAGCCAGT 
TGCCAGTGTC 
TGGCATGGAG 
GAACTGGAGT 
TTCCCCGTGC 
AAGGCCATTG 
GCGGACAGTG 
TTCAAGGTCA 
CAGAGCCGCC 
ACTGTCCGCC 
GACCCCAGTG 
GTGCAGCTGA 
ACCACAGACT 
ACTGACCGGC 
GGTTTCCTGC 



CCTTCTGCCT 
GCCAAGAAGC 
AGAATGGTGG
TGCACTGTGA 
TCAACTTAGA 
GGGGCAGCAT 



GGTTCTGGGA 
ATGTGAAAGT 
AGCTTGTGGT 
TGCACTACCG 
AGAACGTGAC 



TGGGGAGGCC 180

GGACGACCGT 240

GCACCAGGTC 300

CGTGGGCGGT 360

CCACCTGCTT 420

TGCACGTGTG 480

CTACACCCTG 540

GCAGCCACCA 600

CAAGGAGTGC 720

ATGCCCCCCT 780

GCAGAGCTGC 840

CCCAGACCCC 900

TTGTGCCCCT 960

CACTTGTGAC 1020

GAAGTCAGAC 1080

GACGATGCCC 1140

AGAGCTACGC 1200

GAAGACCACA 1260

GTGCCGTGTG 1320

GCCCCCCGTG 1380

CTCCCCGCTG 1440

GCCCCAGGAC 1500

GTTAATGAAC 1560

GGAAGGAGGA 1620

TTTGTTGCAG 1680

CTGGTCCTTG 1740

GGACGGGACA 1800
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CGGGGGCAGG AGCGGCGGGA GAACGTCTCA 

GGACTCACGC CTGGCACCCA CTACCAGCTG 

GGCCCGGCCT CGCCCCCTGC ACACGTGCTT 

CACCTCCACG CCCAGGCCCT CTCAGACTCC 

GCTCTGCCTG GGCCAATATC CAAGTACGTT 

GACCCACTGT GGATAGACGT GGACAGGCCT 

AACGCCAGCA CGCGCTACCT CTTCCGCATG 

AGCAACACAG TAGAAGAGTC CACCCTGGGC 

GAGAGCCGGG CAGCTGAAGA GGGCCTGGAT 

GTGTCTGCCA CCTGCCTCAC CATCCTGGCC 

AGCTGCCTGC ATCGGAGACG CACCTTCACC 

CTGCAGTTCA GCTCAGGGAC CTTGACACTT 

CTGAGCTACC CAGTGCTAGA GTGGGAGGAC 

AACTTCGGCC AGGTCATCCG GGCCATGATC 

ATCAAAATGC TGAAAGAGTA TGCCTCTGAA 

GAAGTTCTGT GCAAATTGGG GCATCACCCC 

AACCGAGGTT ACTTGTATAT CGCTATTGAA 

CTGCGGAAAA GCCGGGTCCT AGAGACTGAC 

TCTACCCTTA GCTCCCGGCA GCTGCTGCGT 

TACCTGAGTG AGAAGCAGTT CATCCACAGG 

GAGAACCTAG CCTCCAAGAT TGCAGACTTG 

AAGAAGACGA TGGGGCGTCT CCCTGTGCGC 

GTCTATACCA CCAAGAGTGA TGTCTGGTCC 

CTTGGAGGTA CACCCTACTG TGGCATGACC 

GGCTACCGCA TGGAGCAGCC TCGAAACTGT 

TGCTGGCGGG ACCGTCCCTA TGAGCGACCC 

CGCATGCTGG AAGCCAGGAA GGCCTATGTG 

GCGGGCATTG ATGCCACAGC TGAGGAGGCC 

GCTGGCCGGA GCAAACTCTG CTGTCTAACC 

CTTAAGCTGC CTCAAGGAAT TTTTTTAACT 

GGTGGGCTTA GGGGAACTGG GTTCCCATGC 

TCCTTCTTTC TAGTTCAGCT GCCCCACAGG 

CAAACCCCCA CTCCAGCTCC TTCGCTTAAG 

CAGCTACTCC CACTCCCGGC CTGTCATTCA 
AAAAA 

Seq ID No: 113 Protein sequence: 
Protein Accession #: NP_005415.1 

1 11 21 

I I I 

MVWRVPPFLL PILFLASHVG AAVDLTLLAN 

LLLEKDDRIV RTPPGPPLRL ARNGSHQVTL 

NSPGAHLLPD KVTHTVNKGD TAVLSARVHK 

QLPNVQPPSS GIYSATYLEA SPLGSAFFRL 

DGECVCPPGF TGTRCEQACR EGRFGQSCQE 

QCQEACAPGH FGADCRLQCQ CQNGGTCDRF 

EFNLETMPRI NCAAAGNPFP VRGSIELRKP 

SGFWECRVST SGGQDSRRFK VMVKVPPVPL 

RLHYRPQDST MDWSTIVVDP SENVTLMNLR

DCPEPLLQPW LEGWHVEGTD RLRVSWSLPL 

QARTALLTGL TPGTHYQLDV QLYHCTLLGP 

QLTWKHPEAL PGPISKYVVE VQVAGGAGDP

SIQGLGDWSN TVEESTLGNG LQAEGEVQES 

LTLVCIRRSC LHRRRTFTYQ SGSGEETILQ 

FEDLIGEGNF GQVIRAMIKK DGLKMNAAIK

INLLGACKNR GYLYIAIEYA PYGNLLDFLR 

SDAANGMQYL SEKQFIHRDL AARNVLVGEN 

AIESLNYSVY TTKSDVWSFG VLLWEIVSLG

EVYELMRQCW RDRPYERPPF AQIALQLGRM 



TCCCCCCAGG CCCGCACTGC CCTCCTGACG 1860

GATGTGCAGC TCTACCACTG CACCCTCCTG 1920

CTGCCCCCCA GTGGGCCTCC AGCCCCCCGA 1980

GAGATCCAGC TGACATGGAA GCACCCGGAG 2040

GTGGAGGTGC AGGTGGCTGG GGGTGCAGGA 2100

GAGGAGACAA GCACCATCAT CCGTGGCCTC 2160

CGGGCCAGCA TTCAGGGGCT CGGGGACTGG 2220

AACGGGCTGC AGGCTGAGGG CCCAGTCCAA 2280

CAGCAGCTGA TCCTGGCGGT GGTGGGCTCC 2340

GCCCTTTTAA CCCTGGTGTG CATCCGCAGA 2400

TACCAGTCAG GCTCGGGCGA GGAGACCATC 2460

ACCCGGCGGC CAAAACTGCA GCCCGAGCCC 2520

ATCACCTTTG AGGACCTCAT CGGGGAGGGG 2580

AAGAAGGACG GGCTGAAGAT GAACGCAGCC 2640

AATGACCATC GTGACTTTGC GGGAGAACTG 2700 

AACATCATCA ACCTCCTGGG GGCCTGTAAG 2760 

TATGCCCCCT ACGGGAACCT GCTAGATTTT 2820

CCAGCTTTTG CTCGAGAGCA TGGGACAGCC 2880

TTCGCCAGTG ATGCGGCCAA TGGCATGCAG 2940

GACCTGGCTG CCCGGAATGT GCTGGTCGGA 3000

GGCCTTTCTC GGGGAGAGGA GGTTTATGTG 3060

TGGATGGCCA TTGAGTCCCT GAACTACAGT 3120

TTTGGAGTCC TTCTTTGGGA GATAGTGAGC 3180

TGTGCCGAGC TCTATGAAAA GCTGCCCCAG 3240

GACGATGAAG TGTACGAGCT GATGCGTCAG 3300

CCCTTTGCCC AGATTGCGCT ACAGCTAGGC 3360

AACATGTCGC TGTTTGAGAA CTTCACTTAC 3420

TGAGCTGCCA TCCAGCCAGA ACGTGGCTCT 3480

TGTGACCAGT CTGACCCTTA CAGCCTCTGA 3540

TAAGGGAGAA AAAAAGGGAT CTGGGGATGG 3600

TTTGTAGGTG TCTCATAGCT ATCCTGGGCA 3660

TGTGTTTCCC ATCCCACTGC TCCCCCAACA 3720

CCAGCACTCA CACCACTAAC ATGCCCTGTT 3780

GAAAAAAATA AATGTTCTAA TAAGCTCCAA 3840



31 41 51 

I I I

LRLTDPQRFF LTCVSGEAGA GRGSDAWGPP 60

RGFSKPSDLV GVFSCVGGAG ARRTRVIYVH 120

EKQTDVIWKS NGSYFYTLDW HEAQDGRFLL 180

IVRGCGAGRW GPGCTKECPG CLHGGVCHDH 240

QCPGISGCRG LTFCLPDPYG CSCGSGWRGS 300

SGCVCPSGWH GVHCEKSDRI PQILNMAEEL 360

DGTVLLSTKA IVEPEKTTAE FEVPRLVLAD 420

AAPRLLTKQS RQLVVSPLVS FSGDGPISTV 480

PKTGYSVRVQ LSRPGEGGEG AWGPPTLMTT 540

VPGPLVGDGF LLRLWDGTRG QERRENVSSP 600

ASPPAHVLLP PSGPPAPRHL HAQALSDSEI 660

LWIDVDRPEE TSTIIRGLNA STRYLFRMRA 720

RAAEEGLDQQ LILAVVGSVS ATCLTILAAL 780

FSSGTLTLTR RPKLQPEPLS YPVLEWEDIT 840

MLKEYASEND HRDFAGSLEV LCKLGHHPNI 900

KSRVLETDPA FAREHGTAST LSSRQLLRFA 960

LASKIADFGL SRGEEVYVKK TMGRLPVRWM 1020

GTPYCGMTCA ELYEKLPQGY RMEQPRNCDD 1080

LEARKAYVNM SLFENFTYAG IDATAEEA 



Seq ID NO: 114 DNA sequence 

Nucleic Acid Accession #: NM_002632.1 

Coding sequence: 322-771 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGGATTCGGG CCGCCCAGCT ACGGGAGGAC CTGGAGTGGC ACTGGGCGCC CGACGGACCA 60

TCCCCGGGAC CCGCCTGCCC CTCGGCGCCC CGCCCCGCCG GGCCGCTCCC CGTCGGGTTC 120 

CCCAGCCACA GCCTTACCTA CGGGCTCCTG ACTCCGCAAG GCTTCCAGAA GATGCTCGAA 180 

CCACCGGCCG GGGCCTCGGG GCAGCAGTGA GGGAGGCGTC CAGCCCCCCA CTCAGCTCTT 240

CTCCTCCTGT GCCAGGGGCT CCCCGGGGGA TGAGCATGGT GGTTTTCCCT CGGAGCCCCC 300 

TGGCTCGGGA CGTCTGAGAA GATGCCGGTC ATGAGGCTGT TCCCTTGCTT CCTGCAGCTC 360 

CTGGCCGGGC TGGCGCTGCC TGCTGTGCCC CCCCAGCAGT GGGCCTTGTC TGCTGGGAAC 420 

GGCTCGTCAG AGGTGGAAGT GGTACCCTTC CAGGAAGTGT GGGGCCGCAG CTACTGCCGG 480 
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GCGCTGGAGA GGCTGGTGGA CGTCGTGTCC 

AGCCCATCCT GTGTCTCCCT GCTGCGCTGC 

TGTGTGCCGG TGGAGACGGC CAATGTCACC 

CGGCCCTCCT ACGTGGAGCT GACGTTCTCT 
CGGGAGAAGA TGAAGCCGGA AAGGTGCGGC

TGGAGGAGAG AGACCCCGCA CCCGGCTCGT 

TCCTGCTGGT ACCTGCCCTC TATTTATTAG 

CCTTCAAGAC GAGGGGCAGG GAAGGACAGG 

TGAGAGAAAG AGAGAAGCCA GCCACAGACC 
ACACGTGGCC TCGTGAGGGG CAAGCTAGGC

GCAGAAGGAA AGAAGGGGGC CCTGCTACCT 

GCAGCCCTTG CTTTCGGAGC TCCTGTCCAA 

ACGGCCTGGT GGTGGGAAGG CCGGCAGCGG 

CTTCTGAAGA TCAGAACATT CAGCTCTGGA 
TCCTTGTCCC CCGTGATCTC CCCTCACACT

TTTCCGGCCG AGGTGCCACC ACCCTGCCCC 

GGCTGGAGAA AGAGCTGCCT GGATGAGAAA 

GGGAGGAGCC TGTGCGTCCC AGCTGAAGGC 

CTGGCACCCC CACAAGCTGT CCCTGCAGGG 
ATAAAGTATT CTAGTGTGGA AACGC

Seq ID No: 115 Protein sequence:

1 11 21

I I I

MPVMRLFPCF LQLLAGLALP AVPPQQWALS
VVSEYPSEVE HMFSPSCVSL LRCTGCCGDE
TFSQHVRCEC RPLREKMKPE RCGDAVPRR

Seq ID NO: 116 DNA sequence 
Nucleic Acid Accession #: NM_007361.1

Coding sequence: 1-4131 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

I I I I I I 

ATGGAGGGGG ACCGGGTGGC CGGGCGGCCG GTGCTGTCGT CGTTACCAGT GCTACTGCTG 60

CTGCAGTTGC TAATGTTGCG GGCCGCGGCG CTGCACCCAG ACGAGCTCTT CCCACACGGG 120 

GAGTCGTGGT GGGACCAGCT CCTGCAGGAA GGCGACGACG TAAAGCTCAG CCGTGGTGAA 180 

GCTGGCGAAT CCCCTGCACT TCTTACGAAG CCCGATTCAG CAACCTCTAC GTGGGCACCA 240

ACGGCATCAT CTCCACTCAG GACTTCCCCA GGGAAACGCA GTATGTGGAC TATGATTTCC 300 

CCACCGACTT CCCGGCCATC GCCCCTTTTC TGGCGGACAT CGACACGAGC CACGGCAGAG 360

GCCGAGTCCT GTACCGAGAG GACACCTCCC CCGCAGTGCT GGGCCTGGCC GCCCGCTATG 420 

TGCGCGCTGG CTTCCCGCGC TCTGCGCGCT TTTTACCCCC ACCCACGCCT TCCTGGCCAC 480

CTGGGAGCAG GTAGGCGCTT ACGAGGAGGT CAAACGCGGG CGCTGCCCTC GGGAGAGCTG 540

AACACTTTCC AGGCAGTTTT GGCATCTGAT GGGTCTGATA GCTACGCCCT CTTTCTTTAT 600 

CCTGCCAACG GCCTGCAGTT CCTTGGAACC CGCCCCAAAG AGTCTTACAA TGTCCAGCTT 660 

CAGCTTCCAG CTCGGGTGGG CTTCTGCCGA GGGGAGGCTG ATGATCTGAA GTCAGAAGGA 720 

CCATATTTCA GCTTGACTAG CACTGAACAG TCTGTGAAAA ATCTCTATCA ACTAAGCAAC 780 

CTGGGGATCC CTGGAGTGTG GGCTTTCCAT ATCGGCAGCA CTTCCCCGTT GGACAATGTC 840

AGGCCAGCTG CAGTTGGAGA CCTTTCCGCT GCCCACTCTT CTGTTCCCCT GGGACGTTCC 900 

TTCAGCCATG CTACAGCCCT GGAAAGTGAC TATAATGAGG ACAATTTGGA TTACTACGAT 960

GTGAATGAGG AGGAAGCTGA ATACCTTCCG GGTGAACCAG AGGAGGCATT GAATGGCCAC 1020 

AGCAGCATTG ATGTTTCCTT CCAATCCAAA GTGGATACAA AGCCTTTAGA GGAATCTTCC 1080 

ACCTTGGATC CTCACACCAA AGAAGGAACA TGTCTGGGAG AGGTAGGGGG CCCAGATTTA 1140

AAAGGCCAAG TTGAGCCCTG GGATGAGAGA GAGACCAGAA GCCCAGCTCC ACCAGAGGTA 1200 

GACAGAGATT CACTGGCTCC TTCCTGGGAA ACCCCACCAC CGTACCCCGA AAACGGAAGC 1260

ATCCAGCCCT ACCCAGATGG AGGGCCAGTG CCTTCGGAAA TGGATGTTCC CCCAGCTCAT 1320 

CCTGAAGAAG AAATTGTTCT TCGAAGTTAC CCTGCTTCAG GTCACACTAC ACCCTTAAGT 1380

CGAGGGACGT ATGAGGTGGG ACTGGAAGAC AACATAGGTT CCAACACCGA GGTCTTCACG 1440

TATAATGCTG CCAACAAGGA AACCTGTGAA CACAACCACA GACAATGCTC CCGGCATGCC 1500 

TTCTGCACGG ACTATGCCAC TGGCTTCTGC TGCCACTGCC AATCCAAGTT TTATGGAAAT 1560

GGGAAGCACT GTCTGCCTGA GGGGGCACCT CACCGAGTGA ATGGGAAAGT GAGTGGCCAC 1620 

CTCCACGTGG GCCATACACC CGTGCACTTC ACTGATGTGG ACCTGCATGC GTATATCGTG 1680 

GGCAATGATG GCAGAGCCTA CACGGCCATC AGCCACATCC CACAGCCAGC AGCCCAGGCC 1740

CTCCTCCCCC TCACACCAAT TGGAGGCCTG TTTGGCTGGC TCTTTGCTTT AGAAAAACCT 1800 

GGCTCTGAGA ACGGCTTCAG CCTCGCAGGT GCTGCCTTTA CCCATGACAT GGAAGTTACA 1860

TTCTACCCGG GAGAGGAGAC GGTTCGTATC ACTCAAACTG CTGAGGGACT TGACCCAGAG 1920 

AACTACCTGA GCATTAAGAC CAACATTCAA GGCCAGGTGC CTTACGTCCC AGCAAATTTC 1980

ACAGCCCACA TCTCTCCCTA CAAGGAGCTG TACCAGTACT CCGACTCCAC TGTGACCTCT 2040

ACAAGTTCCA GAGACTACTC TCTGACTTTT GGTGCAATCA ACCAAACATG GTCCTACCGC 2100 

ATCCACCAGA ACATCACTTA CCAGGTGTGC AGGCACGCCC CCAGACACCC GTCCTTCCCC 2160

ACCACCCAGC AGCTGAACGT GGACCGGGTC TTTGCCTTGT ATAATGATGA AGAAAGAGTG 2220 

CTTAGATTTG CTGTGACCAA TCAAATTGGC CCGGTCAAAG AAGATTCAGA CCCCACTCCG 2280 

GTGAATCCTT GCTATGATGG GAGCCACATG TGTGACACAA CAGCACGGTG CCATCCAGGG 2340

ACAGGTGTAG ATTACACCTG TGAGTGCGCA TCTGGGTACC AGGGAGATGG ACGGAACTGT 2400 



GAGTACCCCA GCGAGGTGGA GCACATGTTC 540 

ACCGGCTGCT GCGGCGATGA GAATCTGCAC 600 

ATGCAGCTCC TAAAGATCCG TTCTGGGGAC 660

CAGCACGTTC GCTGCGAATG CCGGCCTCTG 720

GATGCTGTTC CCCGGAGGTA ACCCACCCCT 780

GTATTTATTA CCGTCACACT CTTCAGTGAC 840 

CCAACTGTTT CCCTGCTGAA TGCCTCGCTC 900 

ACCCTCAGGA ATTCAGTGCC TTCAACAACG 960 

CCTGGGAGCT TCCGCTTTGA AAGAAGCAAG 1020 

CCCAGAGGCC CTGGAGGTCT CCAGGGGCCT 1080 

GTTCTTGGGC CTCAGGCTCT GCACAGACAA 1140 

AGTAGGGATG CGGATTCTGC TGGGGCCGCC 1200 

GCGGAGGGGA TTCAGCCACT TCCCCCTCTT 1260

GAACAGTGGT TGCCTGGGGG CTTTTGCCAC 1320 

TTGCCATTTG CTTGTACTGG GACATTGTTC 13 80 

CACTAAGAGA CACATACAGA GTGGGCCCCG 1440

CAGCTCAGCC AGTGGGGATG AGGTCACCAG 1500 

AGTGGCAGGG GAGCAGGTTC CCCAAGGGCC 1560

CCATCTGACT GCCAAGCCAG ATTCTCTTGA 1620 



I I I 

AGNGSSEVEV VPFQEVWGRS YCRALERLVD 
NLHCVPVETA NVTMQLLKIR SGDRPSYVEL 
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GTGGATGAAA 
AACTTGCCTG 
CATACTTGCA 



ATGAATGTGC 
GAAGCTACAG 
TCTTGATCAC 



GTGTGAGTGC 



TGTGAACAAC 
ATCCCCCAAT 
TTCTGCTGGT 
TCCACCCCGC 
GAGCGCTGGA 
GTGCCCCAGT 



ACCACCCCTG 
GATGTGACCC 
TACTTACCCC 
CATGGCTCCA 
GATGTTGCTG 
ATCGTGAATT 



CTGCAGCTAC 
ATGGGGATGG 
AGCAGCGCCA 
GCGACGAGCA 
GCGTGGACCC 
CTCACTGTGG 
GGGAAAACCT 



TGGGCACCAG 
CTGCTACAAT 
ATTTCAGTGC 
TGCCCAGGCC 
GGGCAACTTC 
TGATGGTCAT 



CATCGCTGTG 
CGGAGTGGTT 
AACCCCTGTG 
CATGGAGGCA 
TGCACTGATG 
ACTCCTGGTT 
ATACCTGACT 
CAGTATGCCT 



GCCCCAACTC TGTATGTATC 
ATGAGTTTGC 
AGGATGGCAG 1 
GCACGTTCAG C 
TAGATGAATG C 
CCTTCTCCTG C 
CCACCTCAAG CCTGACACCC 2820 
ACCCTGGGGC 



3 TCATACCTGT 
3 CTGTGCCTGC 
3 CTCAGAAAAC 



GACGGACAAT 
CAGGTCTGAT 
CGGACAGTGT 
TCTTCTACAC 
TGTACTGGAC 



GCTGGAGCAC 
GGGCCACTTC 
AGATGGCAGA 
CACCGTCGCT 
GGGCACCTTC 
CAGGCTTCAG 
AATTGATTAC 
CAGCCGTGCC 
AAGCCCTGAA 
CCTGGATAAG 
AGATCTGGTG 
AGACTGGAAT 



GAAGTTCCTG 
GAGCCCACCC 
TACGGTGGCA 
ATCCCCCTGC 



GTGCTCTATA 
AAGGATGCAG 
GACTGCCGGG 



CCCCCCGAGA 
AGTGCCACGG 
GCACCCGCTC 
TCCGGCCCAC 
CTCAGGGCCA 
CTAAGACCCT 



CGCAAGGTCC 
CGAGGCAACT 
TTAGATGGAG 
ACCTTTGACC CTTTCTCTAA ACTGCTCTGC 
TGTACACTAC 
AGCATCGTAA 
T C AGTAAAT A 



AGGCAAAGAA 
TTTTGTAGTG 
TTAAAAATGA 
TTAAAGCAAC 
AAGAATTAGA 
ATTAAAGCAC 
AATGCAGAAC 
TTTTTACCTC 
TGCATTTTCT 
TTATAAATAC 
TAAGTTTAAA 



CTGATGGAAC 
GCTATGCAGA 
AACATAGTGG 
TAACTGCAGT 
GACTTGGAGT 
AGTAAAAAAG 
CAAAAAGACT 
AGGTTGTTAT 
TTCTTGTAAA 
GTTTAAGACT 



GGACTTGCCA 
ATAGAGAGCG 
AATCCCCGTG 
AGAGAAGCTC 
AATACAGACA 
TGGGCAGATG 
GTCATTCAAA 



TGGGAGCAGA 
TAGACCACAT 
CCCTGCTGGA 
CCATCGCTGT 
CTAAAATTGA 
TTGGATTGCC 
CAGGAACCAA 
ACAACCTCAA 



TCACTTCTAC 
CCAGTTTACT 
CTACCCCTAC 
TTACAATCAG 
GAATTGGCCA 
TTTGTGAAAA 
TGCAAGTTTA 



TGATGGGAAA 
TGCATCAGTG 
GATCCTGTAG 
CAAAGATGTA 
AAAAAAAAAA 



CTAAACCTGA 
TATAACAATA 
GGACCAATTA 
CATTTCTATT 
TGCCTCTATA 
ACAATTTTAA 
AAAAAAAAA 



GATGAGTATC 
TGCCCAACAG 
AACCTGGACC 
TTAGACGTTC 
GCTGATACCT 
AAAAGGTAAC 

TTTTTGCCAT 
TAATCCTAAA 
TTTATAGTTT 
TATATCAAAA 
GAAGTACCCA 
AATTTTCTAG 



TCCCAGAACA ACGATCTCAC 



CTAAAGAACA 
CTGAGCATCC 
CAATCTTTAC 
AGAATTTTAA 
AAGATCAAAT 
GGATTCCTTC 
CCTTGACAGT 
CCCAACAAAA 
GGTGCTAAAA 
CAGAAAGTAA 
ATTACTCCAA 



GTGACTGCAA 
AAGATGAACA 
TACTGTATTT 
CTGTTGCTTA 
TCATTCAACT 
TGGCCAAGAA 
TGGAGAAGCC 
GTTCTAAGAT 
TGATTCAATT 
AGTATCACAT 
TAAAGTGTTT 



2460 
252C 
2580 
2540 
2700 



CAGCACTGGT 
TCCACCTGGC 
GACCATCTGT 
TGACCAGTAC 
AAAGAGCGAC 
CCAGCCAGGC 



GCAGATTGGC 
GCTGTCTCTG 
GTACTGGACA 
GCCTGAGACG 



TGGCTCTGAG 
GGATCCAATC 
AACGTCATCT 
CAATGGCTTA 
AAAACTGGAG 
GTACCCCTTC 



2940 
3000 
3060 
3120 
3180 
3240 
3300 
33S0 
3420 
348C 
3540 
3500 
3660 
3720 
3780 



3960 
4020 
4080 
4140 
4200 
4250 
4320 
4380 
4440 



Seq ID No: 117 Protein 



I 

MEGDRVAGRP 
AGESPALLTK 
AESCTERTPP 
NTFQAVLASD 
PYFSLTSTEQ 



LQLLMLRAAA 
TASSPLRTSP 
CALASRALRA 



GKRSMWTMIS 



J LGIPGVWAFH 



TLDPHTKEGT 
IQPYPDGGPV 
YNAANKETCE 
LHVGHTPVHF 
GSENGFSLAG 
TAHISPYKEL 
TTQQLNVDRV 
TGVDYTCECA 
HTCILITPPA 
RCHPAATCYN 
IPQCDEQGNF 
ERWRENLLEH 
TTPACIPTVA 
HGSIIVGIDY 
MYWTDSVLDK 
LDGENRRILI 
SIVSYADHFY 



PSEMDVPPAH I 



TDVDLHAYIV 
AAFTHDMEVT 
YHYSDSTVTS 
FALYNDEERV 
SGYQGDGRNC 
NPCEDGSHTC 
TPGSFSCRCQ 
LPLQCHGSTG 
YGGTPRDDQY 
PPMVRPTPRP 
DCRERMVYWT 



NTDIGLPNGL 



GNDGRAYTAI 
FYPGEETVRI 
TSSRDYSLTF 
LRFAVTNQIG 
VDENECATGF 
APAGQARCVH 
PGYYGDGFQC 
FCWCVDPDGH 
VPQCDDLGHF 
DVTPPSVGTF 
DVAGRTISRA 
RKVLFYTDLV 
TFDPFSKLLC 



RPKESYNVQL 
IGSTSPLDNV 
GEPEEALNGH 
ETRSPAPPEV 
PASGHTTPLS 
CHCQSKFYGN 
SHIPQPAAQA 
TQTAEGLDPE 
GAINQTWSYR 
PVKEDSDPTP 
HRCGPNSVCI 
HGGSTFSCAC 
IPDSTSSLTP 
EVPGTQTPPG 



PPTSRPSPLF WRTSTRATAE 
LGAGRRLRGG QTRALPSGEL 
QLPARVGFCR GEADDLKSEG 



SSIDVSFQSK VDTKPLEESS 



RC-TY3VGLED 



LLPLTPIGGL 
NYLSIKTNIQ 
IHQNITYQVC 
VNPCYDGSHM 
NLPGSYRCEC 
LPGYAGDGHQ 
CEQQQRHAQA 



YLPLNGTRLQ 
IWSGLISPE 
RGNLYWTDWN 
CTLPDGTGRR 
LYGITAVYPY 



NIGSNTEVFT 
HRVNGKVSGH 
FGWLFALEKP 
GQVPYVPANF 
RHAPRHPSFP 
CDTTARCHPG 
RSGYEFADDR 
CTDVDECSEN 
QYAYPGARFH 
EPTQRPPTIC 
EVQGTRSQPG 
KDAAKTLLSL 



Seq ID NO: 118 DNA sequence 

Nucleic Acid Accession #: NM_003088.1 

Coding sequence: 112-1593 (underlined sequences correspond to start and stop codons)



1200
1250
1320
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I 

GCGGAGGGTG 
CCCGCCACCC 
AACGGCACAG 
CTGACGGCCG 
CAGATCTGGA 
AGCCACCTGG 
GTGCCCGGTC 
CAGTCCGAGG 
CAGACGGTGT 
ATCTACAGTG 
GCCGTGGACC 
CAGCGCTACA 



CGTGCGGGCC 



I 



CCGAGGCGGT 
AGGCGTTCGG 
CGCTGGAGCA 
GCCGCTACCT 
CCGACTGCCG 
CGCACCGGCG 



GCCGCGCAGC 
GCAGATCCAG 
GTTCAAGGTG 
GCCCCCTGAC 
GGCGGCGGAC 
TTTCCTCATC 
CTACTTCGGC 



31 
I 

AACAAAGGAG 
GGCCTCTCGT 
TTCGGCCTCA 
AACGCGTCCG 
GAGGCGGGCA 
AAGGACGGCA 



CTACTGCCAC 
TCAACTGCGG 
CCAGCAGCCT 



CGCGACTGCG 
AAGGCCACCA 
GTGCTGCAGG 
AATCAGGACG 
AAGTGTGCCT 
CAGTCCACCG 
CGCATCACAC 
GCCGCCTCGG 
CCCATCATCG 
CTGGACGCCA 
AACATCAAAG 



TCACCCGTAA 
GCGACGTGCC 
GCGTGCAGAC 
AGCCGGCCAC 



ACCGTGGACC 
CCACATGGCG 
GGCGGGAGGC 
CCTGTCGCCC 
TCAGCGGCTG 
CGGGGCGAGT 
GAAGCGGCTA 
TTTGCCTCTC 
CTGTCAGTGG 
CGGGAGGGCT 
CTCCCACGTG 
ACAGGGTCTG 
GGGCCGTCTT 
CAAATCAGTA 
GTAGTAGCGA 
CCCCCTCTTT 
GCCAGAGCCC 
CGCCCCCTCC 
TCCCCAACAT 
TATAACTCTA 
AGTCTGC 



AGGTGGGCAA 
CGGCCAACGA 
AGGAGACCGA 
TCCGTACCCA 
CCTCCAGCAA 
TGAGGGCGTC 
TGGAGACAGC 
TGTTCCGCGG 
ACCGCTCCAG 
ACTCCACAGG 
CTCCTGTGGA 
GGCGCTACCT 
CCGCCTCGCT 
GCTCCTGCCA 
AAGCCCCCTT 
CTATGGACTC 
CGGCCTGGCC 
CTGGCACCTC 
AGGGACGGTT 
CCAGCCACCT 
CCCTCCCTGG 



TTTTTTTTAA 
GTGATCTGGC 
CCGTCCTTCC 
CTGCTGTGAT 
GGGAGCCCTG 
GCATCTCACT 



GCGCTACGCG 
CTGGGGCGTC 
CGCCGACCAC 
TGGCTACACG 
CCTGGCGCCG 
GGACGAGCTC 



GTGCACATCG 
CACCTGAGCG 
GACTCGCTCA 
CGCTTCCTGC 
CTGGAGTTCC 
TCGGGGCCCA 



ACC-TGACCTG 
ACGACGGTCG 
ACCGCCTGTC 
CCATGCACCC 

TCACCCTCGC 



51 
I 

GCCGCAGGGA SO 

CATGACCGCC 120 

CAACAAGTAC 180 

GAAGAAGAAG 240 

GTGCCTGCGC 3 00 

CGAGCGCGAG 350 



GAATGCCAGC 
CAATGGCAAG 
AGGGGACTCA 
GGAGCATGGC 
CTATGACGTC 
CAAATACTGG 
CTTCTTCTTC 
GAAGGGCGAC 
CTGGGAGTAC 



GGGGGCTGGG 
CCTCCCAGCC 
TGCACTGTCC 
CTTGTGGTGT 
AGCCTGGCTC 
GTTCTGCCAA 
CTTTCCTTTC 
TGAAATATTA 



TGCTACTTTG 
TTTGTGACCT 
GAGCTCTTCC 
TTCATCGGCT 
TTCCAGCTGG 
ACGGTGGGCA 
GAGTTCTGCG 
CACGCAGGCG 



GCTCCGGCAA 
GCGGCACGCT 
AGCAGAGCTG 
AGGGTATGGA 
AGATCGACCG 
TGACGGCCAC 
ACATCGAGTG 
CCAAGAAGAA 
TCATGAAGCT 



CTTCCAGGAC 
GCGCCTGGTG 
GGTGGCCTTC 



CGCCCAGGTC 
CCTGTCTGCC 
CGACACCAAA 
CGGGGGCGTG 
GCGTGACCGG 



GTGACTCCGC 



CGTCCAGCCC 
TGGTGCTCCC 
GGGTGAGCCG 
CTGGGTGTCT 
ATAGTAGCTT 



TTTCAGATGC 
CCTCAGACGG 
AGCCCTGGGC 
CCCCAGGAGA 
CCGAAACCCC 
TTTTTTGGGT 
CCTTCCCTGG 
GGTGGTGGTG 
ACCCTAGCCT 
TTGCTGGAGG 
TCAGCACCCT 
CAGCCCTGGG 
TGGGCCTCCC 
CCGGGGCCCC 
TGGTCTTTTA 
CAAACTGGAA 



CCGTCCTTCC 
CTCCGCCAGG 
CAGAGAAAAC 
GGTTCCCTAC 
CCCTGCCCTC 



CACGGGCACC 
TGGCGCCTAC 
GGTCACCAGC 
GGTGGCCATC 
CTCGGCGGAA 
CCGCCCCTGC 



GTGTAGTGTA 
GCTGGGCACA 
TGCTTGGGAA 
GGTGGCTGGA 
AGCGGCAGGG 



GACTGGAAGC 
CGTCCCAGGC 
CCCCAGGGGG 
CCTGGGCTGC 
GGGTGGATGA 
CCTGCTGCCA 
TTTTTTGTAA 
ATAGCGAAAT 



GGTGCCCCCA 
TCCCCTCGGG 
TTGTCTGCCA 
TATTTCTCTG 
ACTGGAATCT 
TGTCCCAAGC 
GGGAAGCTGT 
AACAGCCCCT 
CGTGACGGCC 
AGGGGTGTGG 
AGAAAATGAC 
AAGCCTGGCT 
TGCATCTCAG 
CGACACCTGG 
AGCCAGGCGT 
GCCTCCCCCG 
GTGTCATTTG 
AAAATAACTC 



1320 
1380 
1440 
1500 
15S0 
1S20 
1680 
1740 

I860 
1920 
1980 



2640 
2700 
2750 



11 



I 

QIQFGLINCG 
AADKDGNVTC 
KWSVHIAMHP 
ADHRFLRHDG 
DELFALEQSC 
TGKYWTLTAT 



I 



EREVPGPDCR 
QVNIYSVTRK 
RLVARPEPAT 
AQWLQAANE 
GGVQSTASSK 
INRPIIVFRG 
KYWTVGSDSA VTSSGDTPVD 



FKVNASASSL 
FLIVAHDDGR 
RYAHLSARPA 



I 

MTANGTAEAV 
CLRSHLGRYL 
CFAQTVSPAE 
FQDQRYSVQT 
KAGKATKVGK 
DTKKCAFRTH 
GQLAASVETA 
GAYNIKDSTG 
SAETVDPASL 



Seq ID NO: 120 DNA sequence

Nucleic Acid Accession #: NM_006404.1 

Coding sequence: 25-741 (underlined sequences correspond to start and stop codons)



RNVSTRQGMD
NASCYFDIEW
EHGFIGCRKV
FFFEFCDYNK



41 51
I I
KKKQIWTLEQ PPDEAGSAAV
WSLQSEAHRR YFGGTEDRLS
DEIAVDRDVP WGVDSLITLA
VAFRDCEGRY LAPSGPSGTL
LSANQDEETD QETFQLEIDR
RDRRITLRAS NGKFVTSKKN
TGTLDANRSS YDVFQLEFND
VAIKVGGRYL KGDHAGVLKA



I I I I I I 

CAGGTCCGGA GCCTCAACTT CAGGATGTTG ACAACATTGC TGCCGATACT GCTGCTGTCT 

GGCTGGGCCT TTTGTAGCCA AGACGCCTCA GATGGCCTCC AAAGACTTCA TATGCTCCAG 

ATCTCCTACT TCCGCGACCC CTATCACGTG TGGTACCAGG GCAACGCGTC GCTGGGGGGA 



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CACCTAACGC 
TTGCAGGAGC 
TTCCACGGCC 



GTGGCTGTGA 
GACACCCAGG 
CGCACTCGGT 
CATATTTCCG 
CTGGGCGTCC 
ACAGGTGGAC 
GGCTGGCAAG 



GGAGATGGAG 
GCTTTGCTGA 
GAGTTGGGGC 
TCAAAAGATA 
CAGGTGTGTC 
GAAGTGGTGG 
AATATTAATA 



ACGTGCTGGA AGGCCCAGAC 
CCGAGAGCTG GGCGCGCACG 
TCGTGCGCCT GGTGCACCAG 
TGGGCTGTGA GCTGCCTCCC 
ATGGGAGCTC CTTTGTGAGT 
TCACCTCCGG AGTGGTCACC 
ATGAACTGCG GGAATTCCTG 
CGGAAAACAC GAAAGGGAGC 
TGGTGGGCGG TTTCATCATT 
GGCGATGTTA_ATTACTCTCC 
GGAAAGTTTC AGCTCACTGT 
TGACAGCTCC TTTCTTCTCC 
AGGAGAGGTG GACAAAGTAC 
ATTAGTCTGA TAAGTGAATG 
AGGAAGCCTA TGCGCCATCC 
TAACCAAATA AACAAGT CAT 
AGACTTGGGA TGGGACGCTG 
AAATGTAAAA TCCAAGTCAT 
AATTTCTTAT ATTT 



ACCAACACCA 
CAGAGTGGCC 
GAGCGGACCT 
GAGGGCTCTA 
TTCCGGCCGG 
TTCACCCTGC 
GAGGACACCT 
CAAACAAGCC 
GCTGGTGTGG 
AGCCCCGTCA 
GAAGCCAGAC 
CACATCTGCC 

TTTATCTATC 
TCCAAAGACA 
CCACAATCAA 
ATATAATAGG 



CGATCATTCA 
TGCAGTCCIA 
TGGCCTTTCC 
GAGCCCATGT 
AGAGAGCCTT 
AGCAGCTCAA 
GTGTGCAGTA 
GCTCCTACAC 
CT3TAGGCAT 
GAAGGGGCTG 
TCCCCAACTG 
CACTGAAGAT 
AAGAACCTAA 



GCTGCAGCCC 
CCTGCTCCAG 
TCTGACCATC 
CTTCTTCGAA 
GTGGCAGGCA 
TGCCTACAAC 
TGTGCAGAAA 
TTCGCTGGTC 
CTTCCTGTGC 
GATTGATGGA 
AAACACCAGA 



GACAGAATCA 
AATACAACAT 
GTAGAAAGAA 
TCAATTATTA 



GAACGTGTAT 
ACAGATAATG 
CCTGAGGCGT 
TCAATACTTC 
GTAACACGAA 
ATCAATTAAT 



1020 
1080 
1140 
1200 
12S0 



21 



51 



I I I I I I 

MLTTLLPILL LSGWAFCSQD ASDGLQRLHM LQISYFRDPY HVWYQGNASL GGHLTHVLEG

PDTNTTIIQL QPLQEPESWA RTQSGLQSYL LQFHGLVRLV HQERTLAFPL TIRCFLGCEL 

PPEGSRAHVF FEVAVNGSSF VSFRPERALW QADTQVTSGV VTFTLQQLNA YNRTRYELRE 

FLEDTCVQYV QKHISAENTK GSQTSRSYTS LVLGVLVGGF IIAGVAVGIF LCTGGRRC 

Seq ID NO: 122 DNA sequence
Nucleic Acid Accession #: none found

Coding sequence: 2-505 (underlined sequences correspond to start and stop codons)



I 

CGAGAAGCTG 
TGAGATTCCT 
CGAGTCAAAG 
TTCCTCTGCC 
CACAGCAGTA 
CTTTCACGAA 
GAGTGATCCT 
GAAAGTCGGG 
TCTTGGCTCT 
CACTTTTGAT 
AAATCCCCCT 
TCCTTCCCTG 
GTAGTGCTGG 
CTTTTCAAGA 
ACCAAAATGG 
CCAGGGAAAA 



11 21 
I I 
GGAGAGACAC - ~ 
CGATGGGGAT CACAGAGCAC 
GCCACTATCA CCCCATCAGG 
ACTCCTCAGG CTTTCGACTC 
GTAGTGTTGG TGATCTTGAC 
AGCCCCTCTT CCCAGCCAAG 
GAGCCCGCTG CTTTGGGCTC 
GACTGTGATC TGCGGGACAG 
AGTGATGC AT AG GGAAACAG 
GAAACGGGGA ACCAAGAGGA 
TCCTCTAAAT TCCCTTTACT 
ATGATAGAGG AAGTGGAAGT 
GGAGAGATAT TTTCTTATGT 
CATTGGAAAC AAATAGAACA 
AAAGGAAATG TTCTATGTTG 
AAATAAAAAT AAAAAATTAA 



I 

TGAACAAGAC 
GATGTCTACC 
GAGCGTGATT 
CTCCTCTGCC 
CATGACAGTA 
GAAGGAGTCT 
CAGTTCTGCA 
AGCAGAGGGT 
GGGACATGGG 
ACTTACTTGT 
CCACTGAGGA 
GCCTTTAGGA 
TTATTCGGAG 
CAATATAATT 
TTCAGGCTAG 
AGGATTGTTG 



41 

I 

AATTCAGTAA 
CTTCAAATGT 
TCCAAGTTTA 
GTGGTCTTCA 
CTGGGGCrTG 
ATGGGCCCGC 
CATTGCACAA 
GCCTTGCTGG 
CACTCCTGTG 



CATCTATTCC 
CCCTTCAAGC 
ATTCTACGAC 
TATTTGTGAG 



CGGAGTCCCC 4 30 



GCTAAATCAG 
TGGTGATACT 
AATTTGGAGA 
TACATTAAAA 



AGTGATTGAA 
AATAATTTCT 
GTTCGAAATC 



Seq ID No: 123 Protein

Protein Accession #: none found
1 11 21 31 41 51 

I I I I I I 

EKLGETPLVP EQDNSVTSIP EIPRWGSQST MSTLQMSLQA ESKATITPSG SVISKFNSTT 

SSATPQAFDS SSAVVFIFVS TAVVVLVILT MTVLGLVKLC FHESPSSQPR KESMGPPGLE
SDPEPAALGS SSAHCTNNGV KVGDCDLRDR AEGALLAESP LGSSDA 



Nucleic Acid Accession #: NM_006500.1 

Coding sequence: 27-1967 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 
TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 
CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 
TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 
TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 
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GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 42 0 

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 4 80 

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 

TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAC-GAGAA GAACCGGGTC CACATTCAGT 600 

5 CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 72 0 

GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780 

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 

GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 

10 GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 102 0 

TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 10 80 

CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140 

ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 12 00 

15 TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 132 0 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 13 8 0 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1S00 

20 TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 162 0 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 168 0 

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 174 0 

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 18 00 

25 GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 204 0 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 210 0 

30 GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTGCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 222 0 

CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 22 80 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 234 0 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 24 00 

35 GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 258 0 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 264 0 

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 270 0 

40 TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2 820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2 8 80 

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 300 0 

45 GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 306 0 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 312 0 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

50 AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 342 0 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 34 80 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 


Seq ID No: 125 Protein sequence: 
Protein Accession #: NP_006491.1 



1 11 21 31 41 51

| | | | | |

MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV 60

DWFSVHKEKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120

PRSQEYRIQL RVYKAPEEPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP 180

LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE 240

VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREAEEETTN 300

DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPERQEGSS 360

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT 420

QLVKLAIFGP PWMAFKERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV 480

LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILFLELVNLT TLTPDSNTTT GLSTSTASPH 540

TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY FLYKKGKLPC RRSGKQEITL 600
PPSRKTELVV EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRH

Nucleic Acid Accession #: NM_001955.1






Coding sequence: 337-975 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

GGAGCTGTTT ACCCCCACTC TAATAGGGGT TCAATATAAA AAGCCGGCAG AGAGCTGTCC 60
AAGTCAGACG CGCCTCTGCA TCXGCGCCAG GCGAACGGGT CCTGCGCCTC CTGCAGTCCC 120
AGCTCTCCAC CACCGCCGCG TGCGCCTGCA GACGCTCCGC TCGCTGCCTT CTCTCCTGGC 180
AGGCGCTGCC TTTTCTCCCC            ACTTGGGCTG AAGGATCGCT TTGAGATCTG 240
AGGAACCCGC AGCGCTTTGA GGGACCTGAA GCTGTTTTTC TTCGTTTTCC TTTGGGTTCA 300
GTTTGAACGG GAGGTTTTTG ATCCCTTTTT TTCAGAATGG ATTATTTGCT CATGATTTTC 360
TCTCTGCTGT TTGTGGCTTG CCAAGGAGCT CCAGAAACAG CAGTCTTAGG CGCTGAGCTC 420
AGCGCGGTGG GTGAGAACGG CGGGGAGAAA CCCACTCCCA GTCCACCCTG GCGGCTCCGC 480
CGGTCCAAGC GCTGCTCCTG CTCGTCCCTG ATGGATAAAG AGTGTGTCTA CTTCTGCCAC 540
CTGGACATCA TTTGGGTCAA CACTCCCGAG CACGTTGTTC CGTATGGACT TGGAAGCCCT 600
AGGTCCAAGA GAGCCTTGGA GAATTTACTT GCCACAAAGG                       660
TGCCAATGTG CTAGCCAAAA AGACAAGAAG TGCTGGAATT TTTGCCAAGC AGGAAAAGAA 720
CTCAGGGCTG AAGACATTAT GGAGAAAGAC TGGAATAATC ATAAGAAAGG AAAAGACTGT 780
TCCAAGCTTG GGAAAAAGTG TATTTATCAG CAGTTAGTGA GAGGAAGAAA AATCAGAAGA 840
AGTTCAGAGG AACACCTAAG ACAAACCAGG TCGGAGACCA TGAGAAACAG CGTCAAATCA 900
TCTTTTCATG ATCCCAAGCT GAAAGGCAAG CCCTCCAGAG AGCGTTATGT GACCCACAAC 960
CGAGCACATT GGTGACAGAC            GTCTGAAGCC ATAGCCTCCA CGGAGAGCCC 1020
TGTGGCCGAC TCTGCACTCT CCACCCTGGC TGGGATCAGA GCAGGAGCAT CCTCTGCTGG 1080
TTCCTGACTG GCAAAGGACC AGCGTCCTCG TTCAAAACAT            GTTAAGGAGT 1140
TCCCCCAACC ATCTTCACTG GCTTCCATCA GTGGTAACTG CTTTGGTCTC TTCTTTCATC 1200
TGGGGATGAC AATGGACCTC TCAGCAGAAA CACACAGTCA CATTCGAATT
C




Seq ID No: 127 Protein sequence:
Protein Accession #: NP_001946.1



21 



31 



I I I 

MDYLLMIFSL LFVACQGAPE TAVLGAELSA 
KECVYFCHLD IIWVNTPEHV VPYGLGSPRS KRALENLLPT 
NFCQAGKELR AEDIMEKDWN NHKKGKDCSK LGKKCIYQQL 
TMRNSVKSSF HDPKLKGKPS RERYVTHNRA HW 



Seq ID NO: 128 DNA sequence 

Nucleic Acid Accession #: NM_001721.1 

Coding sequence: 34-2061 (underlined sequences correspond to start and stop codons) 
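
The coding-sequence annotation above can be read as a 1-based, inclusive coordinate range into the nucleotide rows that follow it. The sketch below is illustrative only: the helper names (`parse_range`, `extract_cds`, `expected_residues`, `has_start_and_stop`) and the toy sequence are assumptions introduced here, not part of the listing. It shows how such a range might be sliced out of a sequence and checked for a start codon, a stop codon, and the implied protein length.

```python
from typing import Tuple

# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def parse_range(coding: str) -> Tuple[int, int]:
    """Parse a 'start-end' annotation such as '34-2061' (1-based, inclusive)."""
    start, end = coding.split("-")
    return int(start), int(end)


def extract_cds(dna: str, coding: str) -> str:
    """Return the coding sequence for a 1-based inclusive range.

    Whitespace between the 10-nucleotide groups is removed before slicing.
    """
    start, end = parse_range(coding)
    cleaned = "".join(dna.split()).upper()
    return cleaned[start - 1:end]


def expected_residues(coding: str) -> int:
    """Number of amino acids implied by the range, excluding the stop codon."""
    start, end = parse_range(coding)
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1


def has_start_and_stop(cds: str) -> bool:
    """True if the sequence begins with ATG and ends with a stop codon."""
    return cds.startswith("ATG") and cds[-3:] in STOP_CODONS


# Hypothetical toy sequence: two leading bases, ATG GCT TGA, two trailing bases.
toy = "CC ATGGCTTGA GG"
toy_cds = extract_cds(toy, "3-11")
print(toy_cds, has_start_and_stop(toy_cds))
print(expected_residues("34-2061"))
```

Under these assumptions, a range of 34-2061 spans 2028 nucleotides, that is 676 codons including the stop codon, so the matching protein sequence would be expected to contain 675 residues.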



I 



CTTCTTCTCA 
CTTTTTGTTT 
AGCAGAAAAG 
GAGCAGACGC 
TATGTCTATG 



TTCCTGTGTT 
GCTAATCTGC 
GTGCTGAAGA 
ACTCTAGCCC 



TTCAACATGC 
AAAAGTAGCA 
CACACCACCT 



11 

I 

AACAAGCTGA 
AAAGATCACA 
TGACCAAAAC 
GATCCATTGA 
CTGTAGAGAG 
CATCAAATGA 
ACCCCCACCT 
GCCAGCAGAG 
ATACTGCAGT 
TACCTCGGGC 
AATATGACAA 
TAGCGCAATA 
AGTATATTCC 
GCAGCAGTGA 
CAAAGATTTC 
ATTATGACTG 



I 



GCAAAAGAAG 
AAACCTTTCC 
AATTAAGAAA 
ACAGTACCCA 
AGAGAGCCGA 
GCTGGTCAAG 
CTGTAAAGCA 
CAATGAAGAG 
AGTTCCTGTT 
CGAATCAAAG 



TTTCAGATTG 
AGTCAGTGGT 
TACCATAGTG 



AAGGGAAGAC 



CTCAGACAAA 
TACACAGTGT 
CACGTGCATA 
ATTCCAAAGC 
CACCCTGTGT 
TGGGAACTGA 



CCTTATTTAG 
CAAATGCTGA 
TTATTCATTA 
CAACAAAGGC 
AAAGAGAAGA 



GGCTCCATGT 
CCCAAGCTGG 
GAATATATAA 
CCTTCCCAGC 
CACCAATTCA 
GTGAAAGTAT 



TTAAATTCTA 
GCAATGGCTG 
TCTTAGAAAT 
TACACCGGGA 
CTGACTTTGG 



ATGGGAATTC 
GTTTGCTGGT 
AGGAGCATTT 
TAAGGCTGTG 
GAACAAATTA 
TCATCAACAC 
CAACAAGGTC 
GATTACCTTG 
GAAGGGGCAG 
ATTCTTTCAG 



AAACACAGAG 
CTCAAAATGG 
AAAAACTATG 
TCAAAGAAAA 
TTCCCTGACT 
AGCAGTAACC 
CCTGAGTCAA 
AACATCTCCA 
ATGGTTAGAA 
AATGATAAAA 



CAAAATCTAT 
CAAATAATTA 
ATGACAAAAT 
TGGAGAAAGT 
TCTATAAAGA 
TGAAAGCATT 
GGTTCTTCGT 
GTACCCTCTG 
TTCCCACCTT 
ATGCACCATC 
GCTCCCAGCC 
TCTATGGCTC 
GGTGGCAAGT 
AAAAAGAAAG 
GTTCATCTGA 
GATCACAATC 
ATTCGAGCCA 
AAGGAACTGT 
AAAACTACTG 
GCATGATCAC 



TCTAGAAGAA 



GAAAAGGGGC 
AAATCTCGAG 
TGGGCTTCTC 
ACAAAAAGAG 



TTCAAGTACC 
ACCATCTTCA 
CCAGCCAAAC 
AAGAAAACTG 
AAATGTGAAT 
AGAAGAGGAA 
TGAACAGTTA 



CTTGCTGAAT 



TATGATGTTG CTGTTAAGAT 
GAGGCCCAGA CTATGATGAA 
TCAAAGGAAT ACCCCATATA 
TACCTGAGGA GTCACGGAAA 
GTCTGTGAAG GCATGGCCTT 
CGTAACTGCT TGGTGGACAG 
TATGTTCTTG ATGACCAGTA 



CAAACATTAC 
TTTTGATTCC 
ACGGCTCCGC 
AAATGGAATC 
CCAGTTTGGA 
GATCAAGGAG 
ACTCAGCCAT 
CATAGTGACT 
AGGACTTGAA 
CTTGGAGAGT 
AGATCTCTGT 
TGTCAGTTCA 



1020 
1080 
1140 
1200 

1320 
1380 
1440 
1500 
15S0 
1520 
1E80 
1740 
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GTCGGAACAA 
AGCAGCAAGT 
AAGCAGCCCT 
AGGCTTTACC 
CACGAGCTTC 
CGGGAAAAAG 
CTGGCCAGCA 
TTTTAATAGT 
ATTCCCTTGA 
TTTCCAGCCT 
AAAGACTGAG 
ATTGTCACAA 



GGCATCGGAC 
CAGAAAAGCG TCCCACATTT 
ACAAGCAT TG AA GAAGAAAT 
TTTTCATTCA TTTTAAGGAA 
GTTCTCTGTA TTGTCTATTA 
AATTTAGATC AAATTAGTAA 
ATAGCAGAAG CACATTTTCA 
CAGAACTGAA AAATTACTTA 
CAATTAAATA TACTACCAAG 



GCTCCAGAGG 
ATCCTGATGT 
CAGGTGGTTC 
ACCATCTACC 
CAGCAACTCC 
TAGGAGTGCT 
AGTAGGAAGG 
TTTAGAAATG 



GACTGCAATA 
TTGGATATTC 
TACAGAAATG 



TGTTTCATTA CTTCAAATAC 
CAGCCTGGGG 
CCAGGGCCAC 
CAGCTGCTGG 
TGAACCACTT 
AATATAGATG 
TTTAGCTAGT 
GAAACAAAAG 
GATATAACAC 
GTTCATGTGT 
TTATATTGTC 
AAACCG 



TGAAGGTCTC 
AGATCATGTA 
TGTCTTCCAT 
GATAAGAATG 
CATAAGTAAT 
AACAAGGCAG 
TGCTGCTCCT 
TAGAGACTGT 
ATTCTTTTCT 



1920 
1980 
2040 
2100 
21S0 
2220 
2280 
2340 
2400 



MDTKSILEEL 
RCVEKVNLEE 
HSGFFVDGKF 
KMDAPSSSTT 
PDWWQVRKLK 



LLKRSQQKKK 
QTPVERQYPF 
LCCQQSCKAA 
LAQYDNESKK 
SSSSSEDVAS 



LAEHYCFDSI 
KELGSGQFGV 
KEYPIYIVTE 
NCLVDRDLCV 
LMWEVFSLGK 
QLLSSIEPLR 



PKLIHYHQHN 
VQLGKWKGQY 
YISNGCLLNY 
KVSDFGMTRY 
QPYDLYDNSQ 
EKDKH 



Q1VYKDGLLY 
PGCTLWEAYA 
NYGSQPPSSS 
SNQKERNVMH 
VRNSSQVGMY 
SAGMITRLRH 
DVAVKMIKEG 



31 

I 

FVLTKTNLSY 
VYASNEESRS 
NLHTAVNEEK 
TSLAQYDSNS 
TTSKISWEFP 
TVSLFSKAW 
PVSTKANKVP 
SMSEDEFFQE 
SQLLEMCYDV 
GTKFPVKWSA 
LYRPHLASDT 



I 

• YEYDKMKRGS 
QWLKALQKEI 
HRVPTFPDRV 
KKIYGSQENF 
ESSSSEEEEN 
DKKGTVKHYH 
DSVSLGNGIW 
AQTMMKLSHP 



RKGSIEIKKI 



LKIPRAVPVL 
NMQY1PREDF 
LDDYDWFAGN 
VHTNAENKLY 
ELKREEITLIi 
KLVKFYGVCS 
QFIHRDLAAR 



i ELPEKRPTFQ 660 



Seq ID NO: 130 DNA sequence

Nucleic Acid Accession #: NM_012072.2 

Coding sequence: 149-2107 (underlined sequences correspond to start and stop codons)



I 

AAAGCCCTCA 
CCCCTTGGGG 
TCCCGCAGAG 
GCTGCTGCTC 
CGTGGGGACC 
CCACTGCAAC 
CGTCCAGCGA 
CAAGTTCTGG 
GAAGGGCTTC 
GCTCCGGAAC 
GCTCCTTCCC 
CGGAAGTAAC 
GGCCCTGGGG 
CTTGGAGGCT 
CGAGACTCAG 
CAGCTCGGGC 
CCACCAGGAC 
CCGGCTGCTG 
TCGTGGGGGG 
CCAAGGGTAC 
CTCCCCCTGT 
TGGCTATGAG 
GGGTCGCTCG 
TGAGGAGGGC 
TGTGGGCCCG 
CTGTGGCTGC 
TGTGTCTCTG 
GAGCACCGTG 
GGCTACACCC 
ACTCAAGATG 
CGCCACAGCT 
AAACAACGAT 
GGCCATCCTA 



11 

I 

GCCTTTGTGT 
CCCAGCTGGG 
GGCCACACAG 
CTGACCCAGC 
GCCTGCTACA 



31 



GTACTGGCCC 
ATTGGGCTCC 
AGCTGGGTGG 
TCGTGCATCT 
AACCGCCTGC 
ATTGAGGGCT 
GGCCCAGGTC 
GTGCCCTTTG 
AGTCATTATT 
CCCCTCTGTG 
TGCTTTGAAG 
GATGACCTGG 
GCCACGTGCG 
CAGCTGGACT 
GCCCAGGAGT 
CCGGGCGGTC 
CCTTGCGCCC 
TACGTCCTGG 
GGGGGCCCCC 
CTGCCAGGCT 
GGACCACCAT 
CCCCGCGCTG 
ACCACAAGTA 
CTGGCCCCCA 
GCCTCTGGCC 



CCTTCTCTGC GCCGGAGTGG 
AGCCGAGATA GAAGCTCCTG 
AGACCGGG AT G GCCACCTCC 
CCGGGGCGGG GACGGGAGCT 
CGGCCCACTC GGGCAAGCTG 
GCAACCTGGC CACTGTGAAG 
AGCTCCTGAG GCGGGAGGCA 
AGCGAGAGAA GGGCAAGTGC 
GCGGGGGGGA GGACACGCCT 
CCAAGCGCTG TGTGTCTCTG 
CCAAGTGGTC TGAGGGCCCC 
TCGTGTGCAA GTTCAGCTTC 
AGGTGACCTA CACCACCCCC 
CCTCTGCGGC CAATGTAGCC 
TCCTGTGCAA GGAGAAGGCC 
TCAGCCCCAA GTATGGCTGC 
GGGGGGATGG CTCCTTCCTC 
TGACCTGTGC CTCTCGAAAC 
TCCTGGGACC CCATGGGAAA 
CGAGTCAGCT GGACTGTGTG 
GTGTCAACAC CCCTGGGGGC 
CTGGAGAGGG 
AGGGCTGCAC 
CCGGGGAGGA CGGGACTCAG 
TCTGCGACAG CTTGTGCTTC 
GGGTGCTGGC CCCAAATGGG 



AGCGCTGCCG 
AGCAAGGAGG 
GCCCTGACGG 
CTGGACCCTA 
TACTCTAACT 
CTGCTGGACC 
TGTGGGAGCC 
AAAGGCATGT 
TTCCAGACCA 
TGTGGGGAAG 
CCCGATGTGT 
AACTTCAACA 



AACTACACGT 
GACGTGGATG 
TTCCGCTGCG 
GATGTGGATG 



CCCCTCAGCT 
GCTTCTCGCC 
TGCTGCTGCT 
CGGTGGTCTG 
AGGCCCAGAA 
AGC-CCCAGCA 
CGAGGATGAG 
GTCTGCCGCT 



TGTCCCAGCC 
CAGGCTCCCC 
GCCGGCCTCT 
CCAGTTCCTC 



CTCCTGCTGG 
GAGAAGAAGG 
GCTGAGAGCA 



CAACAGCCAG 
GACCTTCGCT 
GTGGGTCCTC 
CCCAGGAGCC 
GGCAAAAGCT 
CCCTGGCTCT 
AGAAGAAGCC 
GGGCCATGGA 
CTAGAGACAC 



TCCCACAAGG 
GTCATCTGAC 
AGGCGTCTGG 
TGCAGGTGGG 
GCTTTTATTC 
GGGGCTACTG 
CCAGAATGCG 
GAACCAGTAC 
TAGAGTCACC 



TGCCAGGACG 
AACACACAAG 
GTCTCTTGCA 
GACAAAGGAG 



AGGGAGCCCA 



TACATCCTAG 
GTCTATCGCA 
GCAGACAGTT 
AGTCCGACAC 
AGCCACCATC 



AATGCTGGGT 
AGTGTGCTCT 
ACTGCTCCTG 
TGGATGAGTG 
GGTCCTTCCA 
CCATGGGGCC 
AGAAAGAAGG 
GCACCCCCAA 
CATCTGCCCC 
GCATCCATCA 
TGGCCACACA 
GCACCGTGGT 
AGCGGAGAGC 
ACTCCTGGGT 
CTGGGACAGA 
CTCAGAGCTT 



1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
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Z ATTCCAAAGG GGCACCCACA TTTTTTTGAA AGACTGGACT GGAATCTTAG 2220 

CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATC-CAGGTAT TTTCTACGGG 2280 

TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGCGTGC CACGGTGGGG ATTTCGTGAC 2340 

TCTATAATGA TTGTTACTCC CCCTCCCTTT TCAAATTCCA ATGTGACCAA TTCCGGATCA 2400 

GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTCCACC 24 SO 

ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGTTTCTCTT 2520 

CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTTTGG 25 80 

TCAAAGGGAA CATGTTCGGA CTGGAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC 2S40 

AGCACAAGTC TTGCTAAATG TGATACTGTT GACATCCTCC AGAATGGCCA GAAGTGCAAT 2700 

TAACCTCTTA GGTGGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATAGC 27S0 

CTTGGGTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCTTG AAGTGCATTA 2 820 

CAGGTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGCG AGAGAGGCCA GGGATTTGTT 2 8 80 

CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA 2940 

TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTTGAG CCAGGGCAGG 3000 

CCTCAGACAC CCTGCCTGTG GCCCCGCCTC CACTTCATCC TGCCCGGAAT GCCAGTGCTC 30SO 

CGAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTCCT AAAGGATGTG 3120 

TGAACGGGAG ATGATGCACT GTGTTTTGAA AGTTGTCATT TTAAAGCATT TTAGCACAGT 3180 

TCATAGTCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGCGCA 3240 

CACACCAAGT AGGGAGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 3300 

TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA 3 3 SO 

TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCATTTTA AAAGTTACAT 3420 

TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 34 80 

TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTG 3 540 

CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CAGAGTCACT AGAAGTTACC 3S00 

2 TGGGAGGCCT CATGTCTCCT GTGGGCTTTT TACCACCACT GTGCAGGAGA 3 6 SO 

3 GAAATGTGTC TCCCTCCAAG GCCCCAAAGC CTCAGAGAAA GGGTGTTTCT 3720 
GGTTTTGCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 3 780 
CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATGCTATGG TTCAGATTGT 3 840 
TTTTAATAGA AAACTAAAGG GGCAGGGGAA GTGAAAGGAA AGATGGAGGT TTTGTGCGGC 3900 
TCGATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAGTTGG 39S0 
AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCTGTGT CCATTTGGCA AAACTTCCTT 4 020 
GGCCACGAGA CTCTAGGTGA TGTGTGAAGC TGGGCAGTCT GTGGTGTGGA GAGCAGCCAT 4080 
CTGTCTGGCC ATTCAGAGGA TTCTAAAGAC ATGGCTGGAT GCGCTGCTGA CCAACATCAG 4140 
CACTTAAATA AATGCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCCCTTA 4200 
TCATTTGGGG TGAAGGAGAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CGCAGTCTGT 4260 
GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 4320 
GCCCAGGCCC ATCGTCTGTT CTCTGAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATG 43 80 
GAACCCCTCT GTGGAACCCA CAAGGGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 4440 
ACCTTCCCTG GCAGGCTGGG TCCCTCTCCT GCTGGGTGGT GCTTTCTCTT GCACACCACT 4500 
CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CAGGTTGTGC ATCTGATGGA 4 5 SO 
AACCACTGGG CTCAAACACG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAGC 4620 
ATGGAAATTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GGGTAAACAA ATGCCCACCG 4 6 80 
GCCAAGAGGC CATTAACAAA TCGTCCTTGT CCTC-AGGGGC CCCAGCTTGC TCGGGCGTGG 4 740 
CACAGTGGGG AATCCAAGGG TCACAGTATG GGGAGAGGTG CACCCTGCCA CCTGCTAACT 4800 
TCTCGCTAGA CACAGTGTTT CTGCCCAGGT GACCTGTTCA GCAGCAGAAC AAGCCAGGGC 4 8 SO 
CATGGGGACG GGGGAAGTTT TCACTTGGAG ATGGACACCA AGACAATGAA GATTTGTTGT 4 920 
CCAAATAGGT CAATAATTCT GGGAGACTCT TGGAAAAAAC TGAATATATT CAGGACCAAC 4 980 
TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA 5040 
CACCCAGCTC GCCATGCCTA CTCATTCCTG AATTTCAGGT GCCATCACTG CTCTTTCTTT 5100 
CTTCTTTGTC ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CTGAGGAATG 5160 
CAGAAAAACC AGGGCAGGAC AGTTATCGAC AATGCATTAG AAC7TGGTGA GCATCCTCTG 5220 
TAGAGGGACT CCACCCCTGC TCAACAGCTT GGCTTCCAGG CAAGACCAAC CACATCTGGT 5280 
CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC 5340 
AACACATCTA CGTGTAGCAC TACGACGTTA TGTTTGGGTA ATGTGGGGAT GAACTGCATG 5400 
AGGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGCGGTCAC TGTCGGCCTT GCAAGGCCAC 5460 
CTGGAGGCCT GTCTGTTAGC CAGTGGTGGA GGAGCAAGGC TTCAGGAAGG GCCAGCCACA 5520 
TGCCATCTTC CCTGCGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA 5580 
TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG 5640 
TTGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AC-TGTCTTAT CCCTGAGCAA 5700 
TCTTTCGATG GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TCCTGTAAAT 57S0 
ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA 5820 
TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTAGCT 5 880 
TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940 
TTAATGCCCC CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000 
TATGATCCCA GAAAACATCT GTCTCTACTT CGGCTGCAAA ACCCATGGTT TAAATCTATA 6060 
TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT CCGAATTCTC CATATATTCA 6120 
CTAATCAAAG ACACTATTTT CATACTAGAT TCCTGAGACA AATACTCACT GAAGGGCTTG 6180 
TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTGTAGATAA TGCCCTTCTA TTTTAGGTAG 6240 
AAGCTCTGGA ATCCCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTC-GCA AGCAGTTCTT 63 00 
TTCAGCAGAT TTTGCCCACT ATTCCTCTGA GCTGAAGTTC TTTGCATAGA TTTGGCTTAA 6360 
GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATGTCA 6420 
GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTATTTCAAA 6480 
TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTATGCAG GATTTACCTT CATCCTGTGC 6540 
ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGAGATCGA GCTTCTCCTC TGAGTTCTAA 6600 
CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGTTGTC CTTTGAGCTT 6S60 
TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT 
Seq ID No: 131 Protein sequence:
Protein Accession #: NP_036204.1

1 11 21 31 41 51

| | | | | |

MATSMGLLLL LLLLLTQPGA GTGADTEAVV CVGTACYTAH SGKLSAAEAQ NHCNQNGGNL 60

ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFWIGLQRE KGKCLDPSLP LKGFSWVGGG 120

EDTPYSNWHK ELRNSCISKR CVSLLLDLSQ PLLPNRLPKW SEGPCGSPGS PGSNIEGFVC 180

KFSFKGMCRP LALGGPGQVT YTTPFQTTSS SLEAVPFASA ANVACGEGDK DETQSHYFLC 240

KEKAPDVFDW GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSFLCGCRPG FRLLDDLVTC 300

ASRNPCSSSP CRGGATCVLG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ DSPCAQECVN 360

TPGGFRCECW VGYEPGGPGE GACQDVDECA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 420

DGTQCQDVDE CVGPGGPLCD SLCFNTQGSF HCGCLPGWVL APNGVSCTMG PVSLGPPSGP 480

PDEEDKGEKE GSTVPRAATA SPTRGPEGTP KATPTTSRPS LSSDAPITSA PLKMLAPSGS 540

SGVWREPSIH HATAASGPQE PAGGDSSVAT QNNDGTDGQK LLLFYILGTV VAILLLLALA 600
LGLLVYRKRR AKREEKKEKK PQNAADSYSW VPERAESRAM ENQYSPTPGT DC


Seq ID NO: 132 DNA sequence

Nucleic Acid Accession #: NM_000963.1 

Coding sequence: 135-1949 (underlined sequences correspond to start and stop codons) 
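
As an illustration of how a coding range relates to a protein sequence, the sketch below translates the first 45 coding nucleotides as printed in the rows for this sequence (positions 135 to 179) using the standard genetic code. The function name `translate` and the compact table construction are assumptions made for this example; the listing itself does not prescribe any software. Unrecognised codons, such as those containing OCR damage, are reported as `X` rather than guessed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Whitespace is ignored; codons not found in the table become 'X'.
    """
    cleaned = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(cleaned) - 2, 3):
        aa = CODON_TABLE.get(cleaned[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Positions 135-179 of the nucleotide rows for this sequence, grouped as printed.
fragment = "ATGCTC GCCCGCGCCC TGCTGCTGTG CGCGGTCCTG GCGCTCAGC"
print(translate(fragment))  # prints MLARALLLCAVLALS
```

The range 135-1949 spans 1815 nucleotides, or 605 codons including the stop codon, so a full translation would be expected to yield 604 residues.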



1 11 21 31 41 51 

I I I I I I 

CAATTGTCAT ACGACTTGCA GTGAGCGTCA GGAGCACGTC CAGGAACTCC TCAGCAGCGC 60 

CTCCTTCAGC TCCACAGCCA GACGCCCTCA GACAGCAAAG CCTACCCCCG CGCCGCGCCC 120 

30 TGCCCGCCGC TCGGATGCTC GCCCGCGCCC TGCTGCTGTG CGCGGTCCTG GCGCTCAGCC 180 

ATACAGCAAA TCCTTGCTGT TCCCACCCAT GTCAAAACCG AGGTGTATGT ATGAGTGTGG 240 

GATTTGACCA GTATAAGTGC GATTGTACCC GGACAGGATT CTATGGAGAA AACTGCTCAA 300 

CACCGGAATT TTTGACAAGA AT AAAAT TAT TTCTGAAACC CACTCCAAAC ACAGTGCACT 360 

ACATACTTAC CCACTTCAAG GGATTTTGGA ACGTTGTGAA TAACATTCCC TTCCTTCGAA 420 

35 ATGCAATTAT GAGTTATGTC TTGACATCCA GATCACATTT GATTGACAGT CCACCAACTT 4 80 

ACAATGCTGA CTATGGCTAC AAAAGCTGGG AAGCCTTCTC TAACCTCTCC TATTATACTA 540 

GAGCCCTTCC TCCTGTGCCT GATGATTGCC CGACTCCCTT GGGTGTCAAA GGTAAAAAGC 600 

AGCTTCCTGA TTCAAATGAG ATTGTGGAAA AATTGCTTCT AAGAAGAAAG TTCATCCCTG 660

ATCCCCAGGG CTCAAACATG ATGTTTGCAT TCTTTGCCCA GCACTTCACG CATCAGTTTT 720

TCAAGACAGA TCATAAGCGA GGGCCAGCTT TCACCAACGG GCTGGGCCAT GGGGTGGACT 780

TAAATCATAT TTACGGTGAA ACTCTGGCTA GACAGCGTAA ACTGCGCCTT TTCAAGGATG 840

GAAAAATGAA ATATCAGATA ATTGATGGAG AGATGTATCC TCCCACAGTC AAAGATACTC 900

AGGCAGAGAT GATCTACCCT CCTCAAGTCC CTGAGCATCT ACGGTTTGCT GTGGGGCAGG 960 

AGGTCTTTGG TCTGGTGCCT GGTCTGATGA TGTATGCCAC AATCTGGCTG CGGGAACACA 1020 

ACAGAGTATG CGATGTGCTT AAACAGGAGC ATCCTGAATG GGGTGATGAG CAGTTGTTCC 1080

AGACAAGCAG GCTAATACTG ATAGGAGAGA CTATTAAGAT TGTGATTGAA GATTATGTGC 1140 

AACACTTGAG TGGCTATCAC TTCAAACTGA AATTTGACCC AGAACTACTT TTCAACAAAC 1200

AATTCCAGTA CCAAAATCGT ATTGCTGCTG AATTTAACAC CCTCTATCAC TGGCATCCCC 1260 

TTCTGCCTGA CACCTTTCAA ATTCATGACC AGAAATACAA CTATCAACAG TTTATCTACA 1320 

ACAACTCTAT ATTGCTGGAA CATGGAATTA CCCAGTTTGT TGAATCATTC ACCAGGCAAA 1380

TTGCTGGCAG GGTTGCTGGT GGTAGGAATG TTCCACCCGC AGTACAGAAA GTATCACAGG 1440 

CTTCCATTGA CCAGAGCAGG CAGATGAAAT ACCAGTCTTT TAATGAGTAC CGCAAACGCT 1500 

TTATGCTGAA GCCCTATGAA TCATTTGAAG AACTTACAGG AGAAAAGGAA ATGTCTGCAG 1560 

AGTTGGAAGC ACTCTATGGT GACATCGATG CTGTGGAGCT GTATCCTGCC CTTCTGGTAG 1620 

AAAAGCCTCG GCCAGATGCC ATCTTTGGTG AAACCATGGT AGAAGTTGGA GCACCATTCT 1680

CCTTGAAAGG ACTTATGGGT AATGTTATAT GTTCTCCTGC CTACTGGAAG CCAAGCACTT 1740

TTGGTGGAGA AGTGGGTTTT CAAATCATCA ACACTGCCTC AATTCAGTCT CTCATCTGCA 1800

ATAACGTGAA GGGCTGTCCC TTTACTTCAT TCAGTGTTCC AGATCCAGAG CTCATTAAAA 1860 

CAGTCACCAT CAATGCAAGT TCTTCCCGCT CCGGACTAGA TGATATCAAT CCCACAGTAC 1920 

TACTAAAAGA ACGTTCGACT GAACTGTAGA AGTCTAATGA TCATATTTAT TTATTTATAT 1980

GAACCATGTC TATTAATTTA ATTATTTAAT AATATTTATA TTAAACTCCT TATGTTACTT 2040

AACATCTTCT GTAACAGAAG TCAGTACTCC TGTTGCGGAG AAAGGAGTCA TACTTGTGAA 2100 

GACTTTTATG TCACTACTCT AAAGATTTTG CTGTTGCTGT TAAGTTTGGA AAACAGTTTT 2160

TATTCTGTTT TATAAACCAG AGAGAAATGA GTTTTGACGT CTTTTTACTT GAATTTCAAC 2220 

TTATATTATA AGAACGAAAG TAAAGATGTT TGAATACTTA AACACTATCA CAAGATGGCA 2280

AAATGCTGAA AGTTTTTACA CTGTCGATGT TTCCAATGCA TCTTCCATGA TGCATTAGAA 2340 

GTAACTAATG TTTGAAATTT TAAAGTACTT TTGGTTATTT TTCTGTCATC AAACAAAAAC 2400 

AGGTATCAGT GCATTATTAA ATGAATATTT AAATTAGACA TTACCAGTAA TTTCATGTCT 2460 

ACTTTTTAAA ATCAGCAATG AAACAATAAT TTGAAATTTC TAAATTCATA GGGTAGAATC 2520 

ACCTGTAAAA GCTTGTTTGA TTTCTTAAAG TTATTAAACT TGTACATATA CCAAAAAGAA 2580

GCTGTCTTGG ATTTAAATCT GTAAAATCAG ATGAAATTTT ACTACAATTG CTTGTTAAAA 2640 

TATTTTATAA GTGATGTTCC TTTTTCACCA AGAGTATAAA CCTTTTTAGT GTGACTGTTA 2700 

AAACTTCCTT TTAAATCAAA ATGCCAAATT TATTAAGGTG GTGGAGCCAC TGCAGTGTTA 2760

TCTCAAAATA AGAATATTTT GTTGAGATAT TCCAGAATTT GTTTATATGG CTGGTAACAT 2820 

GTAAAATCTA TATCAGCAAA AGGGTCTACC TTTAAAATAA GCAATAACAA AGAAGAAAAC 2880

CAAATTATTG TTCAAATTTA GGTTTAAACT TTTGAAGCAA ACTTTTTTTT ATCCTTGTGC 2940

ACTGCAGGCC TGGTACTCAG ATTTTGCTAT GAGGTTAATG AAGTACCAAG CTGTGCTTGA 3000

ATAACGATAT GTTTTCTCAG ATTTTCTGTT GTACAGTTTA ATTTAGCAGT CCATATCACA 3060

TTGCAAAAGT AGCAATGACC TCATAAAATA CCTCTTCAAA ATGCTTAAAT TCATTTCACA 3120 

CATTAATTTT ATCTCAGTCT TGAAGCCAAT TCAGTAG3TG CATTGGAATC AAGCCTGGCT 3180 

ACCTGCATGC TGTTCCTTTT CTTTTCTTCT TTTAGCCATT TTGCTAAGAG ACACAGTCTT 3240 

CTCATCACTT CGTTTCTCCT ATTTTGTTTT ACTAGTTTTA AGATCAGAGT TCACTTTCTT 3300 

TGGACTCTGC CTATATTTTC TTACCTGAAC TTTTGCAAGT TTTCAGGTAA ACCTCAGCTC 3360

AGGACTGCTA TTTAGCTCCT CTTAAGAAGA TTAAAAGAGA AAAAAAAAGG CCCTTTTAAA 3420 

AATAGTATAC ACTTATTTTA AGTGAAAAGC AGAGAATTTT ATTTATAGCT AATTTTAGCT 3480

ATCTGTAACC AAGATGGATG CAAAGAGGCT AGTGCCTCAG AGAGAACTGT ACGGGGTTTG 3540 

TGACTGGAAA AAGTTACGTT CCCATTCTAA TTAATGCCCT TTCTTATTTA AAAACAAAAC 3600 

CAAATGATAT CTAAGTAGTT CTCAGCAATA ATAATAATGA CGATAATACT TCTTTTCCAC 3660

ATCTCATTGT CACTGACATT TAATGGTACT GTATATTACT TAATTTATTG AAGATTATTA 3720

TTTATGTCTT ATTAGGACAC TATGGTTATA AACTGTGTTT AAGCCTACAA TCATTGATTT 3780

TTTTTTGTTA TGTCACAATC AGTATATTTT CTTTGGGGTT ACCTCTCTGA ATATTATGTA 3840 

AACAATCCAA AGAAATGATT GTATTAAGAT TTGTGAATAA ATTTTTAGAA ATCTGATTGG 3900 

CATATTGAGA TATTTAAGGT TGAATGTTTG TCCTTAGGAT AGGCCTATGT GCTAGCCCAC 3960

AAAGAATATT GTCTCATTAG CCTGAATGTG CCATAAGACT GACCTTTTAA AATGTTTTGA 4020 

GGGATCTGTG GATGCTTCGT TAATTTGTTC AGCCACAATT TATTGAGAAA ATATTCTGTG 4080

TCAAGCACTG TGGGTTTTAA TATTTTTAAA TCAAACGCTG ATTACAGATA ATAGTATTTA 4140 

TATAAATAAT TGAAAAAAAT TTTCTTTTGG GAAGAGGGAG AAAATGAAAT AAATATCATT 4200 

AAAGATAACT CAGGAGAATC TTCTTTACAA TTTTACGTTT AGAATGTTTA AGGTTAAGAA 4260

AGAAATAGTC AATATGCTTG TATAAAACAC TGTTCACTGT TTTTTTTAAA AAAAAAACTT 4320 

GATTTGTTAT TAACATTGAT CTGCTGACAA AACCTGGGAA TTTGGGTTGT GTATGCGAAT 4380

GTTTCAGTGC CTCAGACAAA TGTGTATTTA ACTTATGTAA AAGATAAGTC TGGAAATAAA 4440 
TGTCTGTTTA TTTTTGTACT ATTTA 
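
Each nucleotide line in these records is printed as groups of ten bases followed by a running position count, and scanned copies may also carry stray page-margin numbers at the start of a line. A minimal Python sketch for recovering the bare sequence from such lines is given below. It assumes that every letter on a sequence line is a residue and that all digits and spaces are layout; `clean_listing` is a hypothetical helper name, not part of any tool referenced in this document.

```python
def clean_listing(lines):
    """Join listing lines into one sequence, dropping digits and spacing."""
    residues = []
    for line in lines:
        # Keep letters only: position counters, margin numbers and
        # group spacing are all non-alphabetic.
        residues.extend(ch.upper() for ch in line if ch.isalpha())
    return "".join(residues)


# The last two lines of the record above, as printed.
tail = [
    "GTTTCAGTGC CTCAGACAAA TGTGTATTTA ACTTATGTAA AAGATAAGTC TGGAAATAAA 4440",
    "TGTCTGTTTA TTTTTGTACT ATTTA",
]
sequence = clean_listing(tail)
print(len(sequence))
```

Letters produced by misread digits (for example an `S` printed in place of `5` in a position count) would be kept as residues by this approach, so the counters are best checked before the lines are joined.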

Seq ID No: 133 Protein sequence: 
Protein Accession #: NP_000954.1 

MLARALLLCA VLALSHTANP CCSHPCQNRG VCMSVGFDQY KCDCTRTGFY GENCSTPEFL 60

TRIKLFLKPT PNTVHYILTH FKGFWNVVNN IPFLRNAIMS YVLTSRSHLI DSPPTYNADY 120

GYKSWEAFSN LSYYTRALPP VPDDCPTPLG VKGKKQLPDS NEIVEKLLLR RKFIPDPQGS 180

NMMFAFFAQH FTHQFFKTDH KRGPAFTNGL GHGVDLNHIY GETLARQRKL RLFKDGKMKY 240

QIIDGEMYPP TVKDTQAEMI YPPQVPEHLR FAVGQEVFGL VPGLMMYATI WLREHNRVCD 300

VLKQEHPEWG DEQLFQTSRL ILIGETIKIV IEDYVQHLSG YHFKLKFDPE LLFNKQFQYQ 360

NRIAAEFNTL YHWHPLLPDT FQIHDQKYNY QQFIYNNSIL LEHGITQFVE SFTRQIAGRV 420

AGGRNVPPAV QKVSQASIDQ SRQMKYQSFN EYRKRFMLKP YESFEELTGE KEMSAELEAL 480

YGDIDAVELY PALLVEKPRP DAIFGETMVE VGAPFSLKGL MGNVICSPAY WKPSTFGGEV 540

GFQIINTASI QSLICNNVKG CPFTSFSVPD PELIKTVTIN ASSSRSGLDD INPTVLLKER 600
STEL

Seq ID NO: 134 DNA sequence 

Nucleic Acid Accession #: XM_059648.1 

Coding sequence: 35-664 (underlined sequences correspond to start and stop codons) 
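
The coding range can be cross-checked against the length of the translated protein: a range that includes the stop codon should span three bases per residue plus three bases for the stop. The Python sketch below illustrates that arithmetic, assuming 1-based inclusive coordinates and a stop codon inside the stated range; `expected_protein_length` is a hypothetical helper introduced only for this illustration.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residue count implied by a 1-based inclusive range ending in a stop codon."""
    span = cds_end - cds_start + 1
    if span % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    # One codon is the stop codon and encodes no residue.
    return span // 3 - 1


# Coding range stated for this record.
print(expected_protein_length(35, 664))
```

For the stated range 35-664 the function returns 209. A mismatch between this figure and the residue count of the paired protein record usually points to a misprinted coordinate or a damaged sequence line.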



AGGCTGCTGA GACTTCCCTC TAGAATCCTC CAACATGGAG CCTCTTGCAG CTTACCCGCT 60

AAAATGTTCC GGGCCCAGAG CAAAGGTATT TGCAGTTTTG CTGTCTATAG TTCTATGCAC 120

AGTAACGCTA TTTCTTCTAC AACTAAAATT CCTCAAACCT AAAATCAACA GCTTTTATGC 180 

CTTTGAAGTG AAGGATGCAA AAGGAAGAAC TGTTTCTCTG GAAAAGTATA AAGGCAAAGT 240

TTCACTAGTT GTAAACGTGG CCAGTGACTG CCAACTCACA GACAGAAATT ACTTAGGGCT 300 

GAAGGAACTG CACAAAGAGT TTGGACCATC CCACTTCAGC GTGTTGGCTT TTCCCTGCAA 360 

TCAGTTTGGA GAATCGGAGC CCCGCCCAAG CAAGGAAGTA GAATCTTTTG CAAGAAAAAA 420 

CTACGGAGTA ACTTTCCCCA TCTTCCACAA GATTAAGATT CTAGGATCTG AAGGAGAACC 480 

TGCATTTAGA TTTCTTGTTG ATTCTTCAAA GAAGGAACCA AGGTGGAATT TTTGGAAGTA 540

TCTTGTCAAC CCTGAGGGTC AAGTTGTGAA GTTCTGGAAG CCAGAGGAGC CCATTGAAGT 600 

CATCAGGCCT GACATAGCAG CTCTGGTTAG ACAAGTGATC ATAAAAAAGA AAGAGGATCT 660 

ATGAGAATGC CATTGCGTTT CTAATAGAAC AGAGAAATGT CTCCATGAGG GTTTGGTCTC 720 

ATTTTAAACA TTTTTTTTTT GGAGACAGTG TCTCACTCTG TCACCCAGGC TGGAGTGCAG 780 

TAGTGCGTTC TCAGCTCATT GCAACCTCTG CCTTTTTAAA CATGCTATTA AATGTGGCAA 840 

TGAAGGATTT TTTTTTAATG TTATCTTGCT ATTAAGTGGT AATGAATGTT CCCAGGATGA 900

GGATGTTACC CAAAGCAAAA ATCAAGAGTA GCCAAAGAAT CAACATGAAA TATATTAACT 960 

ACTTCCTCTG ACCATACTAA AGAATTCAGA ATACACAGTG ACCAATGTGC CTCAATATCT 1020 

TATTGTTCAA CTTGACATTT TCTAGGACTG TACTTGATGA AAATGCCAAC ACACTAGACC 1080 

ACTCTTTGGA TTCAAGAGCA CTGTGTATGA CTGAAATTTC TGGAATAACT GTAAATGGTT 1140 

ATGTTAATGG AATAAAACAC AAATGTTGAA AAATGTAAAA TATATATACA TAGATTCAAA 1200

TCCTTATATA TGTATGCTTG TTTTGTGTAC AGGATTTTGT TTTTTCTTTT TAAGTACAGG 1260 

TTCCTAGTGT TTTACTATAA CTGTCACTAT GTATGTAACT GACATATATA AATAGTCATT 1320 
TATAAATGAC CGTATTATAA CA 

Seq ID No: 135 Protein sequence:
Protein Accession #: XP_059648.1

1 11 21 31 41 51

I I I I I I

MEPLAAYPLK CSGPRAKVFA VLLSIVLCTV TLFLLQLKFL KPKINSFYAF EVKDAKGRTV 60
SLEKYKGKVS LVVNVASDCQ LTDRNYLGLK ELHKEFGPSH FSVLAFPCNQ FGESEPRPSK 120
EVESFARKNY GVTFPIFHKI KILGSEGEPA FRFLVDSSKK EPRWNFWKYL VNPEGQVVKF 180
WKPEEPIEVI RPDIAALVRQ VIIKKKEDL



Seq ID NO: 136 DNA sequence



CAAGTGCCGT 
CGGTCGAGCT 
AGCGCCCCTC 
CTCCTTGAGG 
TACCATTCTT 
ATCATGGTGC 
ATGGCTGCCT 
GACACTGTGA 
AAGCTGGATG 
TTTGTCCAGA 
GAAACGTTTT 
AATGAAGATT 
TTTGAAAGTA 
AAGGAAATCA 
CGTTGGAGTC 
CAAGCAGCGT 
GATGCCCTCA 
GATGCCGACC 
ATTAGACTTC 
ATTCTTCGGT 
CAGTCTTTGA 
CCTCAGGTCC 
CCCCTCTACG 
GAGGAAGCCC 
GAGAATACAA 
GGGCTGAACA 



I 

CGCCGCGCCC 
ACGGTCGCGG 
GCCACCCCGT 
ATATTCAGTT 
AATGATTATC 
AAAAATACCA
ATGAAAGGAG 
GTGAATTCAA 
TAGATGCACC 
AAAACTCACT 
CCAATCGGGT 
GGACCTGTTT 
CAGTGGAAAA 
TCGAATACTA 
CGCCTTCCAT 
CCATGGCCGT 
GCAGCCCCAG 
ACATCAAGAG 
GCCAGTGGCT 
TCCTCCGTGC 
CGTGGAGAAA 
TTCAGGATTA 
TGCTCAGGCT 
TGCTGAGATA 
AAGTCTTTGG 



CTTCCCCCTC 
ACGAGTGGAA 
GTCCCAGGCC 
TTGTATGTTT 



GTCCCCAGTG 
GTTCCCTACA 
GAGCGAAGAT 
CAGACTGCTG 
GAATTCTCGG 
CATCATTAAT 
TGAACAGTCT 
AATTGCAATG 
CCTTCGCCAA 
CACGCCCTCT 
CGTCATCCCA 
TGCACCTGAG 
ATACCTGGGC 
CCAGGAGACC 
ACGGGATTTT 
GCAGCATCAG 
CTACGCGGGA 



31 
I 

CCGCCTCCCC
CCGAGACTGC 
CGGCCTTTCT 
GAATATCCTC 
CAGGTGTGAG 
AGAGTGTACA 
TGTCCTTTGA 



AAGAAGATTG 
GAACGTACTT 
GAGCATTGCT 
GCAAGTTTAG 
AAACAATATA 
TTAGAAGAAG 



I 

GGCCCCCTCC 
CCCC-CGGAGC 
GACAAGAGCT 
TCACCATGTT 
AGGGTTGCTG 
AATACCCCTT 
TTCCGATGTT 
ATGTCATTGA 
CAGGAGTTGA 
TGCACATTGA 



GAAGCTGCCC 
CCCGTGGTC-G 
GATTTGACTC 
CACAAGGGCA 
AATATTGACA 
GTAGACTACA 



GTATTTCCTG 
TTCCTCATTT 
AAAGAGATTA 
CTGGTCCCCA 
TGGACTGAGA 
ATTCAGATTG 
ATTGTGTTTA 



CAGCTGGGCC 
GTGCAGGGTT 
AGCATGCCTG 
CAGGTCTCTT 
TTCAGAGGTT 
GCCACCACCT 
GCCTGCACCT 
CACCCAGCGG 
GGTAAACGTA 
ACTTAACTCA 
GGGCTCTGTT 



GACATCCTCC 
AAGCTGCCTC 
GTCTGTGGAC 



GGGGGCTCAG 
AAACATTACT 
TTCTGTATGT 
CACCTGCAGT 
TTGTCTCAGA 
ATTTGCCACT 



CCAACTACCC 
TGCTCTGGAC 
ATGCAGGAAA 
TTCCAGATTT 
AATCTCTGTA 
CCATCTACCA 
TGGATGCCTC 
ACATCTATCA 
GCATCACCTC 
GCGACTACAG 
CCCATGTGAC 
CGTGCGCCGC 
CGCACAAGTG 
CCATGACGAG 
CCTCCAGCCA 
AGTGTGCAGA 
CGACATTGTA 
GTCGTTTGAT 
ATAGCCATAG 
GAAAGAAAAG 
AAAATTTTTC 
AGAGATGGCC 
ATGGCCCGCA 
TTAGGGCCAG 
TTGTCCCAGG 
GAGGGGCTCT 
TTCTCTTTCC 
GAACTTGGGT 
CAGCTCCCAG 
GCCCAGACAG 
TGACACTGTC 



CGTTCTCTCC 
TCGGCCTATC 
GTGGAGACCT 
TGAGACACTG 
GCTGGTTAGT 
TGACTACCAG 
CCTGAGTGGG 



GACACCAAAG 
GTAAATGAAG 
AGCTCATGGA 
GGTGTGAAAG 



ATATTAAATC 
CCAGCAACAT 
AAGGCATAAC 
CTTCATCATC 
TCAAGGAGGG 
GCACCCCTGA 
CGCTGCAGGA 
AAATTCCAAA 
A^GCCAGAGA 
TTCTTGAAAC 
ATCACGACAA 
GCTTGGTGAG 



CGCCGGTATG 
AGACTTCGGG 
CAGCATAAAG 
TTGCATTGCA 
TGAATTAATT 
CGTGGGCAGT 
AAGGCGCTGC 
TTATGTTTAT 
GGCTTATAAT 
TCACCCTGAA 

TAAAAAAGGA 
CTTTGTGCCC 
CTCCAAGAAA 
GCTGAGTGGT 
CGACAAACTA 
GAGCTGCCTC 
AGATGAGCAT 
GATCATGTGT 



CCGTTCATTG 
GGTCCTGGAG 
GAGTGCATGT 



CCTGCCTGGT 
CGCTGCTGCG 
TCATCCTGCG 
ATGACAACAC 
GCCTGCTGGA 



AGATGGGCGG 
AGCGCTCGGG 
GCGATGCGAA 
GGACTTGGAA 
GATCATCGAG 
GGCGCCCAGG 
CAGAAGGAAG 
TTACATCGAC 



GTCTGCAAGC 
GTCAGTCATC 
CTCCAAGAGG 

CATGGTGGAG 



CAGCAGCCTT 
TAAAGTGATG 
CCTGGAGTCC 
GTCCCACTCC 



GTCTTCAAAG 
ACTTGGGATT 
TCGCCACAAC 
AACAATGTGC 
TCGCCTCTGA 
GGCTTCTACA 
CCCCGGGTGG 



CCCAAAACTA 
ATTTTGTATA 
TAGTTTCTGT 
CAACGAACTC 
CCTCCTCACC 
CGCCGCCTCA 
CCCTTGAGGT 
GCCAGACACA 
CAGGGACTCC 
TCCTTTTCAA 
GGGGGGGTTC 



AGCCACAGCG 
AGCTCCATGA 
GCCCCTCCTC 
TCACCTCTAG 
CCTTGGCAGG 
CGTTGTGCAC 



CGCATTGTCC 
TGGGACGGAA 
CGGCCCCCAT 
CCTTATCCTC 
CCCACACCAC 



ATCTTTTTGA 
TTCCCGTTTC 
GCCATCTCCT 
TAGGAGGCCG 
TATTAGTAGC 



AGAACGAAGA CCTGAAGCTC 
GAGCCCCACA TGAGATTCTC 
TCGACGTGTG CAAAGGGGAC 
CACCCAAAAA GGACTCCCTG 
AGCTCATAGA CAAAGTCTGG 
TCTGCAAAGA AGGAGAAAGC 
TCCTGCAGTG GAAATTCCAC 
ACGACGTGCT TGCGTCCCTG 
AGGTGATCGG CTCGGAGGAT 
GCTTCTCCCA GCTGAGTGCC 
TCTCCAGGTA GTGCCGCGCT
GGACAGCAGC TGCACCCGCC 
ATAGCAAATA GCTCTCAGAT 
TAGTTTTAAC TCTGATCCTA 
AAAATCCAAC CAGAGCGCAA 
GGATTGACGT GGTCTCAGAT 
ATTAGTGAAT GAATTCCTGT 
GCTGCCAGCT CGCTTCCCCC 
GCTTCCCGCC AGTCAAGATG 
TGAGGATTCA GAGGTTGCCT 
CCACTGTCTG CAGTGGGGCC 
AGGAAAATGC TGCCATCGTT 
TACTTTTTAG AGCAGGATTT 
CTTCCGTGCG TCGCCCCTCT 
CTGTGCCCTC TGGAGGCTCA 
TCTTGGAACC AGCAAGTCGC 
TAAGCAGCAG CTCTCGCATC 



1380 
1440 
1500 
1560
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220 

2340 
2400 
2460 
2520 
2580 
2640
2700 
2760 
2820 
2880 

3000 
3060 
3120
3180
3240
3300
3360
3420 
CACTTCAGGG TGGCGTGTGG CATGTAGGAG TCCTGCTTCT TTGTACATGG GAATTGTGGA 3480

CTCATGCGTG TGTGTGTGTG CATGTGCTGT GTGTGTGCAT GTGTGCATGA CGGTGGGGGT 3540

GCTGGGGGGA CGGGGTGAGT GGAAACTTAG TTTGAGTAAT GAAGGAATCT TCACAGAAGC 3600

AAATCAGAAT ATGGGATTTG TTTGCCTTTT ACATTTTGTT TAATTCCTGA TTTTAAAGCC 3660

TGCTCTATCT GGTACAGGCC CTTATTTTTT CAGCTTTTTA TGGGAAAAGC AGGTTATTTG 3720

AGAATCTGTC CAGAAGTTGC ATAGGGGATG GCCTCCACGA TAAGGACATG CAACACGTGT 3780

TTCTGTGTGC AGCAGAGGCC GTGTTTTTCA TGCCAAACCC CACGCGGCTG TCAACTGTGT 3840

GCGTGGTAGG CATGGAGATC CTGGTTGTGC CGTCTCAGCT CCGCTCTGAA GGCACTGTGT 3900

GGGTGCTGCG TGACTGGAGA GCTGTGTGGA GGCCATGTGT GCCCCGTGCA GGGATCAGGA 3960

GGGCGGGGGA GGGACCGAGC AGCCCTCTTG CCCGGTCGGG TCAGCCCTAG TGGCTGCCTG 4020

CACACTGTAG ACGTCCCAGG GCCTGTGCTG TGATCACCTG CCTTTGGACC ACATTTGTGT 4080

TTGCTCTTAG AGATCGAGCT CCTCAGTGGT ACCTGAAGCC TTTGCTTCCG GAAAGCGCGG 4140

TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT AGGGCTAGTA GGTAGGGCTA 4200

GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG TAGGTAGGGT TAGTAGGTAG 4260

GGCTAGTAGG TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT AGGGCTAGTA 4320

GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG TAGGTAGGGT 4380

TAGTAGGTAG GGCTAGTAGG TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT 4440

AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG 4500

TAGGTAGGGT TAGTAGGTAG GGCTAGTAGG TAGGGCTAGT AGGTAGGGCT AGTAGGTAGG 4560

GTTAGTAGGT AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG 4620

GTAGGGCTGG TAGGTAGGGT TAGTAGGTAG GGCTAGTAGG TAGGGCTAGT AGGTAGGGCT 4680

AGTAGGTAGG GCTAGTAGGT AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG CTAGTAGGTA 4740

GGGTTCGTAG GTAGGGTTCG TAGGTAGGGT TCGTAGGTAG GGTTAGTAGC GCGTCTGTGC 4800

TGCTTCCACC TGGTGCTTCC TGTTCCCAAA TCACAAGGGC CTGAAGGTGG TCCCTGCTTT 4860

CTCTTTCTCT TTCTCTGTGT CTCAGATGGC GATTTTGCTG ACAGCTGCCA AGAAAATGCT 4920

TCACTCAACA GTCCTCATGT GCCCAGAGAT GTTTATAGAA CTGTTTGAAT TGCAGCCATC 4980

CCCTGCCCCC TCCCAGGCTG AAGATCTGTT CTTTTTAAGT TGATTCGGGA GTGGCATTCT 5040

TTTATACCCA AAGACTGTAG TGCATCTTGA AGAGCTCAAA GCACATGACC GCACAAATGC 5100

TTACAGGGTT TCCTCCCGAG TAATCCAATC TCACTCCCCT TGTAAGGGAA TTCTGGGGCA 5160

GCTATGGTTT GAGTATGCAG TTTGCATCGT GTTTCTACCT TTAGTACCTT GCCACTCTTT 5220

TAAAACGCTG CTGTCATTTC CCATTTCTTA GTACTAATGA TTCTTTGATT CTCCCTCTAT 5280

TATGTCTTAA TTCACTTTCC TTCCTAAATT TGTTATTTGC ATATCAAATT CTGTAAATGT 5340

TTTGTAAACA TATTACCTCA CTTGGTAATA CAATACTGAT AGTCTTTAAA AGATTTTTTT 5400
ATTGTTATCA ATAATAAATG TGAACTATTT AAAG

Seq ID No: 137 Protein sequence:
Protein Accession #: NP_002994.1



1 11 21 31 41 51 

I I I I I I 

MVQKYQSPVR VYKYPFELIM AAYERRFPTC PLIPMFVGSD TVSEFKSEDG AIHVIERRCK 60

LDVDAPRLLK KIAGVDYVYF VQKNSLNSRE RTLHIEAYNE TFSNRVIINE HCCYTVHPEN 120 

EDWTCFEQSA SLDIKSFFGF ESTVEKIAMK QYTSNIKKGK EIIEYYLRQL EEEGITFVPR 180 

WSPPSITPSS ETSSSSSKKQ AASMAVVIPE AALKEGLSGD ALSSPSAPEP VVGTPDDKLD 240

ADHIKRYLGD LTPLQESCLI RLRQWLQETH KGKIPKDEHI LRFLRARDFN IDKAREIMCQ 300

SLTWRKQHQV DYILETWTPP QVLQDYYAGG WHHHDKDGRP LYVLRLGQMD TKGLVRALGE 360 

EALLRYVLSV NEERLRRCEE NTKVFGRPIS SWTCLVDLEG LNMRHLWRPG VKALLRIIEV 420 

VEANYPETLG RLLILRAPRV FPVLWTLVSP FIDDNTRRKF LIYAGNDYQG PGGLLDYIDK 480 

EIIPDFLSGE CMCEVPEGGL VPKSLYRTAE ELENEDLKLW TETIYQSASV FKGAPHEILI 540 

QIVDASSVIT WDFDVCKGDI VFNIYHSKRS PQPPKKDSLG AHSITSPGGN NVQLIDKVWQ 600

LGRDYSMVES PLICKEGESV QGSHVTRWPG FYILQWKFHS MPACAASSLP RVDDVLASLQ 660
VSSHKCKVMY YTEVIGSEDF RGSMTSLESS HSGFSQLSAA TTSSSQSHSS SMISR 

Seq ID NO: 138 DNA sequence

Nucleic Acid Accession #: NM_004181.1

Coding sequence: 32-670 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GCAGAAATAG CCTAGGGAGA TCAACCCCGA GATGCTGAAC AAAGTGCTGT CCCGGCTGGG 60

GGTCGCCGGC CAGTGGCGCT TCGTGGACGT GCTGGGGCTG GAAGAGGAGT CTCTGGGCTC 120

GGTGCCAGCG CCTGCCTGCG CGCTGCTGCT GCTGTTTCCC CTCACGGCCC AGCATGAGAA 180

CTTCAGGAAA AAGCAGATTG AAGAGCTGAA GGGACAAGAA GTTAGTCCTA AAGTGTACTT 240

CATGAAGCAG ACCATTGGGA ATTCCTGTGG CACAATCGGA CTTATTCACG CAGTGGCCAA 300

TAATCAAGAC AAACTGGGAT TTGAGGATGG ATCAGTTCTG AAACAGTTTC TTTCTGAAAC 360

AGAGAAAATG TCCCCTGAAG ACAGAGCAAA ATGCTTTGAA AAGAATGAGG CCATACAGGC 420

AGCCCATGAT GCCGTGGCAC AGGAAGGCCA ATGTCGGGTA GATGACAAGG TGAATTTCCA 480

TTTTATTCTG TTTAACAACG TGGATGGCCA CCTCTATGAA CTTGATGGAC GAATGCCTTT 540

TCCGGTGAAC CATGGCGCCA GTTCAGAGGA CACCCTGCTG AAGGACGCTG CCAAGGTGTG 600

CAGAGAATTC ACCGAGCGTG AGCAAGGAGA AGTCCGCTTC TCTGCCGTGG CTCTCTGCAA 660

GGCAGCCTAA TGCTCTGTGG GAGGGACTTT GCTGATTTCC CCTCTTCCCT TCAACATGAA 720

AATATATACC CCCCATGCAG TCTAAAATGC TTCAGTACTT GTGAAACACA GCTGTTCTTC 780

TGTTCTGCAG ACACGCCTTC CCCTCAGCCA CACCCAGGCA CTTAAGCACA AGCAGAGTGC 840

ACAGCTGTCC ACTGGGCCAT TGTGGTGTGA GCTTCAGATG GTGAAGCATT CTCCCCAGTG 900

Seq ID No: 139 Protein sequence:

I I I I I I

MLNKVLSRLG VAGQWRFVDV LGLEEESLGS VPAPACALLL LFPLTAQHEN FRKKQIEELK 60

GQEVSPKVYF MKQTIGNSCG TIGLIHAVAN NQDKLGFEDG SVLKQFLSET EKMSPEDRAK 120
CFEKNEAIQA AHDAVAQEGQ CRVDDKVNFH FILFNNVDGH LYELDGRMPF PVNHGASSED 180
TLLKDAAKVC REFTEREQGE VRFSAVALCK AA 

Seq ID NO: 140 DNA sequence 

Nucleic Acid Accession #: NM_000201.1 

Coding sequence: 58-1656 (underlined sequences correspond to start and stop codons)



I 

GCGCCCCAGT 
GCTCCCAGCA 
CCAGGACCTG 
GGCTCCGTGC 
ACCCCGTTGC 
AGCAATGTGC 
ACAGCTAAAA 
CCCTCTTGGC 
CCCCGGGCCA 



GCCAATTTCT 
AACACCTCGG 
GTCAGCCCCC 
CTGTTCCCAG 
ACAGTCACCT 
GACGAGGGCA 
CTGCAGACAG 
GTCTCAGAAG 
CTGAATGGGG 
CCAGAGGACA 
ATACACAAGA 



CGACGCTGAG C 
GCCCCCGGCC C 
GCAATGCCCA G 
TGGTGACATG C 
CTAAAAAGGA G 
AAGAAGATAG C 
CCTTCCTCAC C 
AGCCAGTGGG C 
ACCTCACCGT G 
AGCCCGCTGA G 
CGTGCCGCAC 1 
CCCCCTACCA G 
GGGTCCTAGA G 
TCTCGGAGGC C 
ATGGCAACGA C 
CCCAGCGGCT G 
TGACCATCTA C 



: GCACTCCTGG 
3 TCCCCCTCAA 
: TGTGACCAGC 
3 CCTGGGAACA 
3 TGCTATTCAA 



GCAACCTCAG 
TCCTGCTCGG 
A^GTCATCCT 
CCAAGTTGTT 
ACCGGAAGGT 
ACTGCCCTGA 



CCTCGCTATG 
GGCTCTGTTC 
GCCCCGGGGA 



P ACCCTACGCT 



TGGGGGAACC 
GGGGAATCAG 
ACTCAAGGGG 
GTCATCATCA 
CTCTATAACC 
CCCATGAAAC 
CCTCGGCCTT 
CATGCAGCTA 
TACAACAGCA 
CTGTAGTCAC 
TTAAAGTCTA 
ATACAACTGG 
CAGAAGAAGT 
GACGGATGCC 
TTCATTTGTT 
GGTCTCTGGC 
TACAGGTTGT 
CATTGGCCAA 
ATGGACTGGT 
CCCCCAAAAC 
CACAATGACA 
GCCTTGTCCT 
TCCTGCAGTG 
CCTCCCAGCT 
CGCTCTGTCA 
TTTGGGCTCA 
ACACCACACC 
TGCCCAGACT 



TTCCAGCCCA 
ACGGGCGCAG 
ACCAGACCCG 
GAAACTGGAC 
CATTGCCCGA 
TGACTGTCAC 
AGGTCACCCG 
CTGTGGTAGC 
GCCAGCGGAA 
CGAACACACA 
CCCATATTGG 
CACCTACCGG 
TTTGGGGCCA 
ATGACTAAGC 
GCCTGATGAG 
GAAATACTGA 
GGCCCTCCAT 
AGCTTGGGCA 
ATTTTACCAG 
CTCACGGAGC 
ACACTGCAGG 
CCTGCCTTTC 



GACAGTGAAG 
GCCACTGGGC 
CTTCTCCTGC 
GGAGCTTCGT 
GTGGCCAGAA 
GCTCAAGTGT 
TCGAGATCTT 
CGAGGTGACC 
AGCCGCAGTC 



ACGGTGCTGG 
CTGCGGCCCC 
TTTGTCCTGC 
CAGGGGACCG 
CTGGCACTGG 
GCCAAGGCCT 
GTAATACTGG 
GCGCCCAACG 
TGTGAGGCCC 
CCGAGGGCCC 
TCTGCAACCC 
GTCCTGTATG 
AATTCCCAGC 
CTAAAGGATG 
GAGGGCACCT 
GTGAATGTGC 



GCCAGGTGGA 
AGGAGCTGAA 
TGAGGAGAGA 



CAGCGACTCC 
TGGTCTGTTC 
GGGACCAGAG 



GTATGAACTG 
TGGGCAGTCA 
GGCACCCCTC 
GGGTGGGGCA 
ACGGGAGCCA 
TCACCATGGA 
GCTGTTTGAG 
CCCACAACTT 



TGGCAGTGGT 
CCCTGGGACG 
TGGTACCTGC 
CAAGAGGAAG 



TACAGACTAC 
CCCTGAACCT 
GCCACACTGA 



GGAACCAGAG 
TGATTCTGAC 
ACCCTAGAGC 
AGCTCCTGCT 
TGGAGGTGGC 
GCCCCCGACT 
AGACTCCAAT 
GCACTTTCCC 
ACCTCTGTCG 
TCTCCCCCCG 
CTGCAGGCCT 
AACAGGCCCA 



GTTGAACCCC 
GACCGCAGAG 
CCAGGAGACA 
GAAGCCAGAG 
CAAGGIGACG 
GAAGGCCACC 



TGACACCTTT 
CTCAGCGGTC 
CTTGTCCTGT 
ATCAGGGTCC 
TTGGAAGGGT 
CCCAGGCTGG 
AGTGATCCTC 
TGGCAAATTT 
TCCTTTGTGT 



AACTTGCTGC 
AGACATGTGT 
CTGCTGTCTA 
CTATTTATTG 
TCCCAGTCCA 
AGAGTGCCTG 
CCCAGAAGGA 
AGGTTCAGAG 
GTTAGCCACC 
ATGTCTGGAC 
TTGCATTTCA 
TGCAAGCAGT 
CATCCGCGTG 



CTATTGGGTA 
AGCATCAAAA 
CTGACCCCAA 
AGTGTCTTTT 



GTGATTTTTC 
ATTACCCAGT 
TCCCCACCCA 
ATGAGTGCCC 
CTGGGAGCTT 



ACAGAGTGGA 
GGGCATTGTC 
CACTAGGCCA 
CAAGACATGA 
CATAGCCCCA 
TGCTGAGGCC 
CACAAAGGCC 
CCCTTGATGA 
ATGTAGGCTA 
AAGGTCACCA 
CAAATGGGGC 
TATCGGCACA 
GAGGCCTTAT 
CATACATTTC 



3GACGAGAGG 
GTGCCAGGCT 
ACTGCCCATC 
GGCCAGGAGC 
GTATGAGATT 
CAGCACGTAC 
AAAAGGGACC 
AGGGCCTCTT 
AGACATATGC 
CTCAGTCAGA 
CGCATCTGAT 
TTGATGGATG 
CCATGAGGAC 
CACAGACTTA 
CACACTTCCT 
TATGTATTTA 
AATGAACATA 
GGTACAGTTG 
TGGGACTTCT 
AAAGCACTAT 



CCACCTCAGC 
GATTTTTTTT 
TAGTTAATAA 



TGTGTGTGTG 
TGCAATCATG 
CTCCTGAGTA 
TTTTTTTTCA 
AGCTTTCTCA 



GTTCACIGCA 
GCTGGGACCA 
GAGACGGGGT 
ACTGCC 



TGCCAGTGTT 
CCCAAGCTAT 
AGCTCCAGTT 
TGGAGGACTC 
GACAAGCTCT 
GTCTTGACCT 
TAGGCTCACA 



1200 
1260
1320 
1380 
1440 
1500 
1620
1680

1800 
1860
1920 



2100 
2160 
2220 
2280 
2340 
2400 
1 11 21 31 41 51

I I I I I I 

MLQFVRAGAR AWLRPTGSQG LSSLAEEAAR ATENPEQVAS EGLPEPVLRK VELPVPTHRR 60 
PVQAWVESLR GFEQERVGLA DLHPDVFATA PRLDILHQVA MWQKNFKRIS YAKTKTRAEV 120 
RGGGGKPLAA ERHWAGPAWQ HPLSALARRR CCPWPPGPTS YYYMLPMKVR ALGLKVALTV 180 
KLAQDDLHIM DSLELPTGDP QYLTELAHYR RWGDSVLLVD LTHEEMPQSI VEATSRLKTF 240 
NLIPAVGLNV HSMLKHQTLV LTLPTVAFLE DKLLWQDSRY RPLYPFSLPY SDFPRPLPHA 300
TQGPAATPYH C 

Seq ID NO: 142 DNA sequence 

Nucleic Acid Accession #: NM_000270.1 

Coding sequence: 110-979 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

AACTGTGCGA ACCAGACCCG GCAGCCTTGC TCAGTTCAGC ATAGCGGAGC GGATCCGATC 60

GGATCGGAGC ACACCGGAGC AGGCTCATCG AGAAGGCGTC TGCGAGACCA TGGAGAACGG 120

ATACACCTAT GAAGATTATA AGAACACTGC AGAATGGCTT CTGTCTCATA CTAAGCACCG 180 

ACCTCAAGTT GCAATAATCT GTGGTTCTGG ATTAGGAGGT CTGACTGATA AATTAACTCA 240 

GGCCCAGATC TTTGACTACA GTGAAATCCC CAACTTTCCT CGAAGTACAG TGCCAGGTCA 300

TGCTGGCCGA CTGGTGTTTG GGTTCCTGAA TGGCAGGGCC TGTGTGATGA TGCAGGGCAG 360

GTTCCACATG TATGAAGGGT ACCCACTCTG GAAGGTGACA TTCCCAGTGA GGGTTTTCCA 420 

CCTTCTGGGT GTGGACACCC TGGTAGTCAC CAATGCAGCA GGAGGGCTGA ACCCCAAGTT 480 

TGAGGTTGGA GATATCATGC TGATCCGTGA CCATATCAAC CTACCTGGTT TCAGTGGTCA 540

GAACCCTCTC AGAGGGCCCA ATGATGAAAG GTTTGGAGAT CGTTTCCCTG CCATGTCTGA 600 

TGCCTACGAC CGGACTATGA GGCAGAGGGC TCTCAGTACC TGGAAACAAA TGGGGGAGCA 660 

ACGTGAGCTA CAGGAAGGCA CCTATGTGAT GGTGGCAGGC CCCAGCTTTG AGACTGTGGC 720 

AGAATGTCGT GTGCTGCAGA AGCTGGGAGC AGACGCTGTT GGCATGAGTA CAGTACCAGA 780 

AGTTATCGTT GCACGGCACT GTGGACTTCG AGTCTTTGGC TTCTCACTCA TCACTAACAA 840 

GGTCATCATG GATTATGAAA GCCTGGAGAA GGCCAACCAT GAAGAAGTCT TAGCAGCTGG 900 

CAAACAAGCT GCACAGAAAT TGGAACAGTT TGTCTCCATT CTTATGGCCA GCATTCCACT 960 

CCCTGACAAA GCCAGTTGAC CTGCCTTGGA GTCGTCTGGC ATCTCCCACA CAAGACCCAA 1020

GTAGCTGCTA CCTTCTTTGG CCCCTTGCTG GAGTCATGTG CCTCTGTCCT TAGGTTGTAG 1080 

CAGAAAGGAA AAGATTCCTG TCCTTCACCT TTCCCACTTT CTTCTACCAG ACCCTTCTGG 1140 

TGCCAGATCC TCTTCTCAAA GCTGGGATTA CAGGTGTGAG CATAGTGAGA CCTTGGCGCT 1200

ACAAAATAAA GCTGTTCTCA TTCCTGTTCT TTCTTACACA AGAGCTGGAG CCCGTGCCCT 1260

ACCACACATC TGTGGAGATG CCCAGGATTT GACTCGGGCC TTAGAACTTT GCATAGCAGC 1320 

TGCTACTAGC TCTTTGAGAT AATACATTCC GAGGGGCTCA GTTCTGCCT? ATCTAAATCA 1380 
CCAGAGACCA AACAAGGACT AATCCAATAC CTCTTGGA 

Seq ID No: 143 Protein sequence:



1 11 21 31 41 51 

I I I I I I

MENGYTYEDY KNTAEWLLSH TKHRPQVAII CGSGLGGLTD KLTQAQIFDY SEIPNFPRST 60
VPGHAGRLVF GFLNGRACVM MQGRFHMYEG YPLWKVTFPV RVFHLLGVDT LVVTNAAGGL 120
NPKFEVGDIM LIRDHINLPG FSGQNPLRGP NDERFGDRFP AMSDAYDRTM RQRALSTWKQ 180 
MGEQRELQEG TYVMVAGPSF ETVAECRVLQ KLGADAVGMS TVPEVIVARH CGLRVFGFSL 240 
ITNKVIMDYE SLEKANHEEV LAAGKQAAQK LEQFVSILMA SIPLPDKAS 

Seq ID NO: 144 DNA sequence 

Nucleic Acid Accession #: NM_015577.1 

Coding sequence: 112-3054 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GAAGCGGCGG GCGGGGTGGA GCAGCCAGCT GGGTCCGGGG AGCGCCGCCG CCGCCTCGAT 60

GGGGTGTTGA AAAGTCTCCT CTAGAGCTTT GGAAGGCTGA ATGCACTAAA CATGAAGAGC 120

TTGAAAGCGA AGTTCAGGAA GAGTGACACC AATGAGTGGA ACAAGAATGA TGACCGGCTA 180

CTGCAGGCCG TGGAGAATGG AGATGCGGAG AAGGTGGCCT CACTGCTCGG CAAGAAGGGG 240

GCCAGTGCCA CCAAACACGA CAGTGAGGGC AAGACCGCTT TCCATCTTGC TGCTGCAAAA 300

GGACACGTGG AATGCCTCAG GGTCATGATT ACACATGGTG TGGATGTGAC AGCCCAAGAT 360

ACTACCGGAC ACAGCGCCTT ACATCTCGCA GCCAAGAACA GCCACCATGA ATGCATCAGG 420

AGGCTGCTTC AGTCTAAATG CCCAGCCGAA AGTGTCGACA GCTCTGGGAA AACAGCTTTA 480

CATTATGCAG CGGCTCAGGG CTGCCTTCAA GCTGTGCAGA TTCTCTGCGA ACACAAGAGC 540

CCCATAAACC TCAAAGATTT GGATGGGAAT ATACCGCTGC TTCTTGCTGT ACAAAATGGT 600

CACAGTGAGA TCTGTCACTT TCTCCTGGAT CATGGAGCAG ATGTCAATTC CAGGAACAAA 660

AGTGGAAGAA CTGCTCTCAT GCTGGCCTGT GAGATTGGCA GCTCTAACGC TGTGGAAGCC 720

TTAATTAAAA AGGGTGCAGA CCTAAACCTT GTAGATTCTC TTGGATACAA TGCCTTACAT 780

TATTCCAAAC TCTCAGAAAA TGCAGGAATT CAAAGCCTTC TATTATCAAA AATCTCTCAG 840

GATGCTGATT TAAAGACCCC AACAAAACCA AAGCAGCATG ACCAAGTCTC TAAAATAAGC 900

TCAGAAAGAA GTGGAACTCC AAAAACACGC AAAGCTCCAC CACCTCCTAT CAGTCCTACC 960

CAGTTGAGTG ATGTCTCTTC CCCAAGATCA ATAACTTCGA CTCCACTATC GGGAAAGGAA 1020 

TCGGTATTTT TTGCTGAACC ACCCTTCAAG GCTGAGATCA GTTCTATACG AGAAAACAAA 1080

GACAGACTAA GTGACAGTAC TACAGGTGCT GATAGCTTAT TGGATATAAG TTCTGAAGCT 1140 

GACCAACAAG ATCTTCTCTC TCTATTGCAA GCAAAAGTTG CTTCCCTTAC CTTACACAAT 1200

AAGGAGTTAC AAGATAAATT ACAGGCCAAA TCACCCAAGG AGGCGGAAGC AGACCTAAGC 1260

TTTGACTCAT ACCATTCCAC CCAAACTGAC TTGGGCCCAT CCCTGGGAAA ACCTGGTGAA 1320 

ACCTCTCCCC CAGACTCCAA ATCATCTCCA TCTGTCTTAA TACATTCTTT AGGTAAATCC 1380 

ACTACTGACA ATGATGTCAG AATTCAGCAA CTGCAAGAGA TTTTGCAAGA TCTACAGAAG 1440 

AGATTAGAGA GCTCTGAAGC AGAGAGAAAA CAGCTACAGG TCGAACTCCA ATCCCGAAGG 1500

GCAGAACTGG TATGCTTAAA CAACACTGAG ATTTCAGAGA ACAGCTCTGA CCTCAGCCAG 1560 

AAACTTAAAG AAACTCAGAG CAAATACGAG GAGGCTATGA AAGAAGTCCT TAGTGTGCAG 1620 

AAGCAGATGA AACTCGGTCT TGTCTCACCT GAAAGCATGG ATAATTATTC ACATTTCCAC 1680

GAGCTGAGGG TCACGGAAGA GGAAATAAAT GTGCTAAAGC AGGATCTGCA GAATGCATTA 1740

GAAGAAAGTG AAAGAAATAA AGAGAAAGTG AGAGAGTTAG AGGAAAAACT GGTAGAGAGG 1800

GAGAAAGGTA CAGTGATTAA GCCACCTGTG GAAGAGTACG AGGAAATGAA AAGTTCATAT 1860 

TGCTCTGTTA TTGAGAATAT GAATAAGGAG AAAGCATTTT TGTTTGAGAA ATACCAAGAA 1920 

GCCCAAGAAG AAATCATGAA ATTAAAAGAC ACACTAAAAA GTCAGATGAC ACAGGAAGCC 1980 

AGTGATGAAG CTGAGGACAT GAAAGAAGCC ATGAATAGGA TGATAGATGA ACTCAATAAA 2040 

CAGGTGAGCG AGCTGTCACA GCTGTACAAA GAAGCCCAGG CTGAGCTGGA GGATTACAGG 2100

AAGAGGAAAT CTCTAGAGGA TGTCACAGCT GAATATATCC ATAAAGCAGA GCATGAGAAA 2160

CTGATGCAAT TGACAAACGT GTCCAGGGCT AAAGCAGAAG ATGCACTGTC TGAAATGAAG 2220 

TCTCAGTATT CAAAAGTGTT GAATGAGTTG ACCCAGCTCA AACAACTGGT GGATGCACAA 2280 

AAAGAGAACT CTGTCTCTAT CACAGAACAT TTGCAAGTGA TAACCACGCT GCGGACTGCA 2340 

GCAAAAGAGA TGGAAGAAAA AATAAGCAAT CTTAAGGAAC ACCTTGCAAG CAAGGAAGTG 2400

GAAGTAGCAA AGCTGGAGAA ACAACTCTTA GAAGAGAAAG CTGCTATGAC TGATGCAATG 2460

GTACCTCGGT CTTCCTATGA AAAACTCCAG TCATCCTTAG AGAGTGAAGT GAGTGTGTTG 2520 

GCATCGAAAT TAAAGGAATC TGTGAAAGAG AAAGAGAAGG TCCATTCAGA GGTTGTCCAG 2580

ATTAGAAGTG AGGTCTCACA GGTGAAAAGA GAAAAGGAAA ATATTCAGAC TCTCTTGAAA 2640

TCCAAAGAGC AAGAAGTAAA TGAACTTCTG CAAAAATTCC AGCAAGCTCA GGAAGAACTT 2700

GCAGAAATGA AAAGATACGC TGAGAGCTCT TCAAAACTGG AGGAAGATAA AGATAAAAAG 2760 

ATAAATGAGA TGTCGAAGGA AGTCACCAAA TTGAAGGAGG CCTTGAACAG CCTCTCCCAG 2820

CTCTCCTACT CAACAAGCTC ATCCAAAAGG CAGAGTCAGC AGCTGGAGGC GCTGCAGCAG 2880

CAAGTCAAAC AGCTCCAGAA CCAGCTGGCG GAATGCAAGA AACAACACCA GGAGGTCATA 2940

TCAGTTTACA GAATGCATCT TCTGTATGCT GTGCAGGGCC AGATGGATGA AGATGTCCAG 3000

AAAGTACTGA AGCAAATCCT TACCATGTGT AAAAACCAGT CTCAAAAGAA GTAAAGTGGA 3060 

TTCCTTGGCA GGACACTGCC CCTTGTCATC TGTCTTTGTG TTAGATCCAG AGTTGTCGGC 3120 

AGCCGCTGCC ATTGTTCTCA TTCGTGGTAT GCACTGTGGC CTAGCGTAGC TTCTTCCCTT 3180 

TCCAAAGGTT TCTGAGGACT TCTCCCAGGA GAAGACTGCC CGCCTCAGAA CTGCTTAGAG 3240 

ACTTCAAACC AGCAGAGGTG AAAGTCCCTG TCATCCCTTC AGATTCCAGA GCTGGGATCA 3300

GCCATGCCCA GAGGTCTGGT CCTGATGCTG GCAGGGGGGC CCCCTCCTCC ATCCCTGACT 3360 

GGCTGAGTGG CTTTATCACC ACCGAGTGAT GTGCTGAGGC CTCCTGCAGT GAATGCTCCT 3420

TCCATTCCTG TACTCGGGCA GTGCCATTCA GCACAGGAGA GCTCTTTTTG CCTTTGGCTT 3480

TCAATTCCAA AACATGATTT AATTTCTAAC TAAATTAGTA TGGCACTAGT TATGAAGTAT 3540

CTGCTTAAAA CCCTTCATCA TGATATCCTG TGGATTTAAA AACTCTAATT CCATGTTTTC 3600

TTCCCATCTG CCTTATATAT CTCATCACCC TGCTTATCAA TATTCAGTTT GATGAGCACT 3660 

ATTAACTAAA ATATGAAACT TAAAAACAAA AGCAAGTTGT CCTTAAAAGT TCTTTTTTTA 3720 

AGTAAATTGT TGACATACTG CAAATTTTCT ATGCAAACTT GCCTCCTGCT GTTATCTGTG 3780 

AAGCTCAGGA AATCCAAACA TTTGTGTTTC AACAAGGGAC AGTAAACTGT GTGTTTACAG 3840

CCAAAAGAAA TGCCTCATAG TTCTTAACCT CAACTTTTGT AGAAGTATTT TTTTCTCTGT 3900

AATATTTTTA TTGGCTCATA AAGATGTTTT CATATCTGAA CTCCTAAATA AGTGAAATTA 3960

CAGTAGATTA TA1TAACAAA ATACTTTTTA GGTAGCCATG CTTGAGACTT TTTAAAAATA 4020 

TAACTTTTTC CTTAAAGTTT TCAGCTATAG CAAAAGGTAG TTATGTATGC CAGACCTAAT 4080

ATGAGCTGCC ACCAACACCC CTAGAACTTT CAGCCATGGT GTCTTCAGAA TTGTAGCGCA 4140

TTTCTGAATC TAGCAAATCC TCCTTTTACC CGTTGAATGT TTTGAATGCC CTGACTCTAC 4200

CAGCGCCCAT AAATGATCTC TAGAAGGACT GTTAGTACCA ATCTGTTTTT CAACTTTGAA 4260 

GCTAAAAACC CTGATATGGT AATATTATGG TGCATAGCAG AGGTCTCGGA AAAAAAATAT 4320 

TTCTGTTCAC TTTACTTTCA GGTTAAAAAT GTTTCTAACA CGCTTGCAAC TTCCCTTATG 4380 

GCATTAATCT TGTTGAGGGA GAGAGACAGA ATCCTGGACT CTCCAAAGTA TTTAACTGAA 4440

AGTAGGGCCT GCTCTGACAG GGCCCATGTC CCACAAGGCT GCTTGGCCTC AGTGGGTGCT 4500

TGGCTGTGCT GGATGATATG TTGATCTGTA TTGGATAAGG ACCAATGACA GCAAAGCAAA 4560 

AATGGCTTTA AAGCTTGGTG TTACTTTTCT TAAGTTGTTT AATTATAGTT AAGCAATTTC 4620 

AAAAATGCTC CAAAGAAATG TGAAAGGACC TTTTGTCACA GCACTTCAGA AAATACACAA 4680 

CAGCCCCTTC TGCCCCCGCA CAGAAATGCT GCAGAGTATA TAAAACTTGA GACATTTTTG 4740 

TAGGATGCCT GACGAGGTGT AGCCTTTTAT CTTGTTTCCG GATGCATATT TATTACGAGT 4800

ACTCTGGTTA AATATTGAAA AGTTATATGC TGTAGTTTTT AGTATTTTGT CTTTGTAATT 4860

TACAGAAGTT ATTGGAGAAA ATAAACTTGT TTCATTTTGC AAAAAAAAAA AAAAAAAAAA 4920
AAAAA

Seq ID No: 145 Protein sequence:

AAKGHVECLR VMITHGVDVT AQDTTGHSAL HLAAKNSHHE CIRRLLQSKC PAESVDSSGK 120

TALHYAAAQG CLQAVQILCE HKSPINLKDL DGNIPLLLAV QNGHSEICHF LLDHGADVNS 180 

RNKSGRTALM LACEIGSSNA VEALIKKGAD LNLVDSLGYN ALHYSKLSEN AGIQSLLLSK 240

ISQDADLKTP TKPKQHDQVS KISSERSGTP KTRKAPPPPI SPTQLSDVSS PRSITSTPLS 300 

GKESVFFAEP PFKAEISSIR ENKDRLSDST TGADSLLDIS SEADQQDLLS LLQAKVASLT 360

LHNKELQDKL QAKSPKEAEA DLSFDSYHST QTDLGPSLGK PGETSPPDSK SSPSVLIHSL 420 

GKSTTDNDVR IQQLQEILQD LQKRLESSEA ERKQLQVELQ SRRAELVCLN NTEISENSSD 480

LSQKLKETQS KYEEAMKEVL SVQKQMKLGL VSPESMDNYS HFHELRVTEE EINVLKQDLQ 540

NALEESERNK EKVRELEEKL VEREKGTVIK PPVEEYEEMK SSYCSVIENM NKEKAFLFEK 600 

YQEAQEEIMK LKDTLKSQMT QEASDEAEDM KEAMNRMIDE LNKQVSELSQ LYKEAQAELE 660

DYRKRKSLED VTAEYIHKAE HEKLMQLTNV SRAKAEDALS EMKSQYSKVL NELTQLKQLV 720

DAQKENSVSI TEHLQVITTL RTAAKEMEEK ISNLKEHLAS KEVEVAKLEK QLLEEKAAMT 780 

DAMVPRSSYE KLQSSLESEV SVLASKLKES VKEKEKVHSE VVQIRSEVSQ VKREKENIQT 840

LLKSKEQEVN ELLQKFQQAQ EELAEMKRYA ESSSKLEEDK DKKINEMSKE VTKLKEALNS 900 

LSQLSYSTSS SKRQSQQLEA LQQQVKQLQN QLAECKKQHQ EVISVYRMHL LYAVQGQMDE 960 
DVQKVLKQIL TMCKNQSQKK 

Seq ID NO: 146 DNA sequence 

Nucleic Acid Accession #: NM_000459.1

Coding sequence: 149-3523 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CTTCTGTGCT GTTCCTTCTT GCCTCTAACT TGTAAACAAG ACGTACTAGG ACGATGCTAA 60 

TGGAAAGTCA CAAACCGCTG GGTTTTTGAA AGGATCCTTG GGACCTCATG CACATTTGTG 120 

GAAACTGGAT GGAGAGATTT GGGGAAGCAT GGACTCTTTA GCCAGCTTAG TTCTCTGTGG 180

AGTCAGCTTG CTCCTTTCTG GAACTGTGGA AGGTGCCATG GACTTGATCT TGATCAATTC 240 

CCTACCTCTT GTATCTGATG CTGAAACATC TCTCACCTGC ATTGCCTCTG GGTGGCGCCC 300 

CCATGAGCCC ATCACCATAG GAAGGGACTT TGAAGCCTTA ATGAACCAGC ACCAGGATCC 360

GCTGGAAGTT ACTCAAGATG TGACCAGAGA ATGGGCTAAA AAAGTTGTTT GGAAGAGAGA 420

AAAGGCTAGT AAGATCAATG GTGCTTATTT CTGTGAAGGG CGAGTTCGAG GAGAGGCAAT 480

CAGGATACGA ACCATGAAGA TGCGTCAACA AGCTTCCTTC CTACCAGCTA CTTTAACTAT 540 

GACTGTGGAC AAGGGAGATA ACGTGAACAT ATCTTTCAAA AAGGTATTGA TTAAAGAAGA 600 

AGATGCAGTG ATTTACAAAA ATGGTTCCTT CATCCATTCA GTGCCCCGGC ATGAAGTACC 660 

TGATATTCTA GAAGTACACC TGCCTCATGC TCAGCCCCAG GATGCTGGAG TGTACTCGGC 720 

CAGGTATATA GGAGGAAACC TCTTCACCTC GGCCTTCACC AGGCTGATAG TCCGGAGATG 780 

TGAAGCCCAG AAGTGGGGAC CTGAATGCAA CCATCTCTGT ACTGCTTGTA TGAACAATGG 840 

TGTCTGCCAT GAAGATACTG GAGAATGCAT TTGCCCTCCT GGGTTTATGG GAAGGACGTG 900 

TGAGAAGGCT TGTGAACTGC ACACGTTTGG CAGAACTTGT AAAGAAAGGT GCAGTGGACA 960 

AGAGGGATGC AAGTCTTATG TGTTCTGTCT CCCTGACCCC TATGGGTGTT CCTGTGCCAC 1020 

AGGCTGGAAG GGTCTGCAGT GCAATGAAGC ATGCCACCCT GGTTTTTACG GGCCAGATTG 1080 

TAAGCTTAGG TGCAGCTGCA ACAATGGGGA GATGTGTGAT CGCTTCCAAG GATGTCTCTG 1140 

CTCTCCAGGA TGGCAGGGGC TCCAGTGTGA GAGAGAAGGC ATACCGAGGA TGACCCCAAA 1200 

GATAGTGGAT TTGCCAGATC ATATAGAAGT AAACAGTGGT AAATTTAATC CCATTTGCAA 1260 

AGCTTCTGGC TGGCCGCTAC CTACTAATGA AGAAATGACC CTGGTGAAGC CGGATGGGAC 1320 

AGTGCTCCAT CCAAAAGACT TTAACCATAC GGATCATTTC TCAGTAGCCA TATTCACCAT 1380

CCACCGGATC CTCCCCCCTG ACTCAGGAGT TTGGGTCTGC AGTGTGAACA CAGTGGCTGG 1440 

GATGGTGGAA AAGCCCTTCA ACATTTCTGT TAAAGTTCTT CCAAAGCCCC TGAATGCCCC 1500 

AAACGTGATT GACACTGGAC ATAACTTTGC TGTCATCAAC ATCAGCTCTG AGCCTTACTT 1560 

TGGGGATGGA CCAATCAAAT CCAAGAAGCT TCTATACAAA CCCGTTAATC ACTATGAGGC 1620 

TTGGCAACAT ATTCAAGTGA CAAATGAGAT TGTTACACTC AACTATTTGG AACCTCGGAC 1680

AGAATATGAA CTCTGTGTGC AACTGGTCCG TCGTGGAGAG GGTGGGGAAG GGCATCCTGG 1740 

ACCTGTGAGA CGCTTCACAA CAGCTTCTAT CGGACTCCCT CCTCCAAGAG GTCTAAATCT 1800 

CCTGCCTAAA AGTCAGACCA CTCTAAATTT GACCTGGCAA CCAATATTTC CAAGCTCGGA 1860 

AGATGACTTT TATGTTGAAG TGGAGAGAAG GTCTGTGCAA AAAAGTGATC AGCAGAATAT 1920 

TAAAGTTCCA GGCAACTTGA CTTCGGTGCT ACTTAACAAC TTACATCCCA GGGAGCAGTA 1980 

CGTGGTCCGA GCTAGAGTCA ACACCAAGGC CCAGGGGGAA TGGAGTGAAG ATCTCACTGC 2040 

TTGGACCCTT AGTGACATTC TTCCTCCTCA ACCAGAAAAC ATCAAGATTT CCAACATTAC 2100 

ACACTCCTCG GCTGTGATTT CTTGGACAAT ATTGGATGGC TATTCTATTT CTTCTATTAC 2160 

TATCCGTTAC AAGGTTCAAG GCAAGAATGA AGACCAGCAC GTTGATGTGA AGATAAAGAA 2220 

TGCCACCATC ATTCAGTATC AGCTCAAGGG CCTAGAGCCT GAAACAGCAT ACCAGGTGGA 2280

CATTTTTGCA GAGAACAACA TAGGGTCAAG CAACCCAGCC TTTTCTCATG AACTGGTGAC 2340 

CCTCCCAGAA TCTCAAGCAC CAGCGGACCT CGGAGGGGGG AAGATGCTGC TTATAGCCAT 2400 

CCTTGGCTCT GCTGGAATGA CCTGCCTGAC TGTGCTGTTG GCCTTTCTGA TCATATTGCA 2460

ATTGAAGAGG GCAAATGTGC AAAGGAGAAT GGCCCAAGCC TTCCAAAACG TGAGGGAAGA 2520 

ACCAGCTGTG CAGTTCAACT CAGGGACTCT GGCCCTAAAC AGGAAGGTCA AAAACAACCC 2580 

AGATCCTACA ATTTATCCAG TGCTTGACTG GAATGACATC AAATTTCAAG ATGTGATTGG 2640

GGAGGGCAAT TTTGGCCAAG TTCTTAAGGC GCGCATCAAG AAGGATGGGT TACGGATGGA 2700 

TGCTGCCATC AAAAGAATGA AAGAATATGC CTCCAAAGAT GATCACAGGG ACTTTGCAGG 2760

AGAACTGGAA GTTCTTTGTA AACTTGGACA CCATCCAAAC ATCATCAATC TCTTAGGAGC 2820

ATGTGAACAT CGAGGCTACT TGTACCTGGC CATTGAGTAC GCGCCCCATG GAAACCTTCT 2880

GGACTTCCTT CGCAAGAGCC GTGTGCTGGA GACGGACCCA GCATTTGCCA TTGCCAATAG 2940 

CACCGCGTCC ACACTGTCCT CCCAGCAGCT CCTTCACTTC GCTGCCGACG TGGCCCGGGG 3000 

CATGGACTAC TTGAGCCAAA AACAGTTTAT CCACAGGGAT CTGGCTGCCA GAAACATTTT 3060

AGTTGGTGAA AACTATGTGG CAAAAATAGC AGATTTTGGA TTGTCCCGAG GTCAAGAGGT 3120

GTACGTGAAA AAGACAATGG GAAGGCTCCC AGTGCGCTGG ATGGCCATCG AGTCACTGAA 3180

TTACAGTGTG TACACAACCA ACAGTGATGT ATGGTCCTAT GGTGTGTTAC TATGGGAGAT 3240 

TGTTAGCTTA GGAGGCACAC CCTACTGCGG GATGACTTGT GCAGAACTCT ACGAGAAGCT 3300 

GCCCCAGGGC TACAGACTGG AGAAGCCCCT GAACTGTGAT GATGAGGTGT ATGATCTAAT 3360

GAGACAATGC TGGCGGGAGA AGCCTTATGA GAGGCCATCA TTTGCCCAGA TATTGGTGTC 3420

CTTAAACAGA ATGTTAGAGG AGCGAAAGAC CTACGTGAAT ACCACGCTTT ATGAGAAGTT 3480

TACTTATGCA GGAATTGACT GTTCTGCTGA AGAAGCGGCC TAGGACAGAA CATCTGTATA 3540 

CCCTCTGTTT CCCTTTCACT GGCATGGGAG ACCCTTGACA ACTGCTGAGA AAACATGCCT 3600 

CTGCCAAAGG ATGTGATATA TAAGTGTACA TATGTGCTGG AATTCTAACA AGTCATAGGT 3660 

TAATATTTAA GACACTGAAA AATCTAAGTG ATATAAATCA GATTCTTCTC TCTCATTTTA 3720

TCCCTCACCT GTAGCATGCC AGTCCCGTTT CATTTAGTCA TGTGACCACT CTGTCTTGTG 3780 

TTTCCACAGC CTGCAAGTTC AGTCCAGGAT GCTAACATCT AAAAATAGAC TTAAATCTCA 3840

TTGCTTACAA GCCTAAGAAT CTTTAGAGAA GTATACATAA GTTTAGGATA AAATAATGGG 3900

ATTTTCTTTT CTTTTCTCTG GTAATATTGA CTTGTATATT TTAAGAAATA ACAGAAAGCC 3960 

TGGGTGACAT TTGGGAGACA TGTGACATTT ATATATTGAA TTAATATCCC TACATGTATT 4020

GCACATTGTA AAAAGTTTTA GTTTTGATGA GTTGTGAGTT TACCTTGTAT ACTGTAGGCA 4080
CACTTTGCAC TGATATATCA TGAGTGAATA AATGTCTTGC CTACTCAAAA AAAAAAAA

Seq ID No: 147 Protein sequence:
Protein Accession #: NP_000450.1



1 11 21 31 41 51 

I I I I I I

MDSLASLVLC GVSLLLSGTV EGAMDLILIN SLPLVSDAET SLTCIASGWR PHEPITIGRD 60

FEALMNQHQD PLEVTQDVTR EWAKKVVWKR EKASKINGAY FCEGRVRGEA IRIRTMKMRQ 120

QASFLPATLT MTVDKGDNVN ISFKKVLIKE EDAVIYKNGS FIHSVPRHEV PDILEVHLPH 180

AQPQDAGVYS ARYIGGNLFT SAFTRLIVRR CEAQKWGPEC NHLCTACMNN GVCHEDTGEC 240 

ICPPGFMGRT CEKACELHTF GRTCKERCSG QEGCKSYVFC LPDPYGCSCA TGWKGLQCNE 300 

ACHPGFYGPD CKLRCSCNNG EMCDRFQGCL CSPGWQGLQC EREGIPRMTP KIVDLPDHIE 360

VNSGKFNPIC KASGWPLPTN EEMTLVKPDG TVLHPKDFNH TDHFSVAIFT IHRILPPDSG 420 

VWVCSVNTVA GMVEKPFNIS VKVLPKPLNA PNVIDTGHNF AVINISSEPY FGDGPIKSKK 480 

LLYKPVNHYE AWQHIQVTNE IVTLNYLEPR TEYELCVQLV RRGEGGEGHP GPVRRFTTAS 540 

IGLPPPRGLN LLPKSQTTLN LTWQPIFPSS EDDFYVEVER RSVQKSDQQN IKVPGNLTSV 600 

LLNNLHPREQ YVVRARVNTK AQGEWSEDLT AWTLSDILPP QPENIKISNI THSSAVISWT 660

ILDGYSISSI TIRYKVQGKN EDQHVDVKIK NATIIQYQLK GLEPETAYQV DIFAENNIGS 720

SNPAFSHELV TLPESQAPAD LGGGKMLLIA ILGSAGMTCL TVLLAFLIIL QLKRANVQRR 780

MAQAFQNVRE EPAVQFNSGT LALNRKVKNN PDPTIYPVLD WNDIKFQDVI GEGNFGQVLK 840

ARIKKDGLRM DAAIKRMKEY ASKDDHRDFA GELEVLCKLG HHPNIINLLG ACEHRGYLYL 900

AIEYAPHGNL LDFLRKSRVL ETDPAFAIAN STASTLSSQQ LLHFAADVAR GMDYLSQKQF 960

IHRDLAARNI LVGENYVAKI ADFGLSRGQE VYVKKTMGRL PVRWMAIESL NYSVYTTNSD 1020

VWSYGVLLWE IVSLGGTPYC GMTCAELYEK LPQGYRLEKP LNCDDEVYDL MRQCWREKPY 1080
ERPSFAQILV SLNRMLEERK TYVNTTLYEK FTYAGIDCSA EEAA

Seq ID NO: 148 DNA sequence

Nucleic Acid Accession #: NM_000552.2

Coding sequence: 311-8752 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

AGCTCACAGC TATTGTGGTG GGAAAGGGAG GGTGGTIGGT GGATGTCACA GCTTGGGCTT 60 

TATCTCCCCC AGCAGTGGGG ACTCCACAGC CCCTGGGCTA CATAACAGCA AGACAGTCCG 120 

GAGCTGTAGC AGACCTGATT GAGCCTTTGC AGCAGCTGAG AGCATGGCCT AGGGTGGGCG 180 

GCACCATTGT CCAGCAGCTG AGTTTCCCAG GGACCTTGGA GATAGCCGCA GCCCTCATTT 240

GCAGGGGAAG GCACCATTGT CCAGCAGCTG AGTTTCCCAG GGACCTTGGA GATAGCCGCA 300 

GCCCTCATTT ATGATTCCTG CCAGATTTGC CGGGGTGCTG CTTGCTCTGG CCCTCATTTT 360

GCCAGGGACC CTTTGTGCAG AAGGAACTCG CGGCAGGTCA TCCACGGCCC GATGCAGCCT 420 

TTTCGGAAGT GACTTCGTCA ACACCTTTGA TGGGAGCATG TACAGCTTTG CGGGATACTG 480 

CAGTTACCTC CTGGCAGGGG GCTGCCAGAA ACGCTCCTTC TCGATTATTG GGGACTTCCA 540

GAATGGCAAG AGAGTGAGCC TCTCCGTGTA TCTTGGGGAA TTTTTTGACA TCCATTTGTT 600

TGTCAATGGT ACCGTGACAC AGGGGGACCA AAGAGTCTCC ATGCCCTATG CCTCCAAAGG 660

GCTGTATCTA GAAACTGAGG CTGGGTACTA CAAGCTGTCC GGTGAGGCCT ATGGCTTTGT 720 

GGCCAGGATC GATGGCAGCG GCAACTTTCA AGTCCTGCTG TCAGACAGAT ACTTCAACAA 780 

GACCTGCGGG CTGTGTGGCA ACTTTAACAT CTTTGCTGAA GATGACTTTA TGACCCAAGA 840

AGGGACCTTG ACCTCGGACC CTTATGACTT TGCCAACTCA TGGGCTCTGA GCAGTGGAGA 900 

ACAGTGGTGT GAACGGGCAT CTCCTCCCAG CAGCTCATGC AACATCTCCT CTGGGGAAAT 960 

GCAGAAGGGC CTGTGGGAGC AGTGCCAGCT TCTGAAGAGC ACCTCGGTGT TTGCCCGCTG 1020 

CCACCCTCTG GTGGACCCCG AGCCTTTTGT GGCCCTGTGT GAGAAGACTT TGTGTGAGTG 1080 

TGCTGGGGGG CTGGAGTGCG CCTGCCCTGC CCTCCTGGAG TACGCCCGGA CCTGTGCCCA 1140

GGAGGGAATG GTGCTGTACG GCTGGACCGA CCACAGCGCG TGCAGCCCAG TGTGCCCTGC 1200 

TGGTATGGAG TATAGGCAGT GTGTGTCCCC TTGCGCCAGG ACCTGCCAGA GCCTGCACAT 1260 

CAATGAAATG TGTCAGGAGC GATGCGTGGA TGGCTGCAGC TGCCCTGAGG GACAGCTCCT 1320 

GGATGAAGGC CTCTGCGTGG AGAGCACCGA GTGTCCCTGC GTGCATTCCG GAAAGCGCTA 1380 

CCCTCCCGGC ACCTCCCTCT CTCGAGACTG CAACACCTGC ATTTGCCGAA ACAGCCAGTG 1440

GATCTGCAGC AATGAAGAAT GTCCAGGGGA GTGCCTTGTC ACTGGTCAAT CCCACTTCAA 1500

GAGCTTTGAC AACAGATACT TCACCTTCAG TGGGATCTGC CAGTACCTGC 1560

TTGCCAGGAC CACTCCTTCT CCATTGTCAT TGAGACTGTC CAGTGTGCTG ATGACCGCGA 1620 

CGCTGTGTGC ACCCGCTCCG TCACCGTCCG GCTGCCTGGC CTGCACAACA GCCTTGTGAA 1680 

ACTGAAGCAT GGGGCAGGAG TTGCCATGGA TGGCCAGGAC ATCCAGCTCC CCCTCCTGAA 1740

AGGTGACCTC CGCATCCAGC ATACAGTGAC GGCCTCCGTG CGCCTCAGCT ACGGGGAGGA 1800 

CCTGCAGATG GACTGGGATG GCCGCGGGAG GCTGCTGGTG AAGCTGTCCC CCGTCTACGC 1860 

CGGGAAGACC TGCGGCCTGT GTGGGAATTA CAATGGCAAC CAGGGCGACG ACTTCCTTAC 1920 

CCCCTCTGGG CTGGCAGAGC CCCGGGTGGA GGACTTCGGG AACGCCTGGA AGCTGCACGG 1980 

GGACTGCCAG GACCTGCAGA AGCAGCACAG CGATCCCTGC GCCCTCAACC CGCGCATGAC 2040 

CAGGTTCTCC GAGGAGGCGT GCGCGGTCCT GACGTCCCCC ACATTCGAGG CCTGCCATCG 2100

TGCCGTCAGC CCGCTGCCCT ACCTGCGGAA CTGCCGCTAC GACGTGTGCT CCTGCTCGGA 2160 

CGGCCGCGAG TGCCTGTGCG GCGCCCTGGC CAGCTATGCC GCGGCCTGCG CGGGGAGAGG 2220 

CGTGCGCGTC GCGTGGCGCG AGCCAGGCCG CTGTGAGCTG AACTGCCCGA AAGGCCAGGT 2280

GTACCTGCAG TGCGGGACCC CCTGCAACCT GACCTGCCGC TCTCTCTCTT ACCCGGATGA 2340 

GGAATGCAAT GAGGCCTGCC TGGAGGGCTG CTTCTGCCCC CCAGGGCTCT ACATGGATGA 2400 

GAGGGGGGAC TGCGTGCCCA AGGCCCAGTG CCCCTGTTAC TATGACGGTG AGATCTTCCA 2460 

GCCAGAAGAC ATCTTCTCAG ACCATCACAC CATGTGCTAC TGTGAGGATG GCTTCATGCA 2520 

CTGTACCATG AGTGGAGTCC CCGGAAGCTT GCTGCCTGAC GCTGTCCTCA GCAGTCCCCT 2580

GTCTCATCGC AGCAAAAGGA GCCTATCCTG TCGGCCCCCC ATGGTCAAGC TGGTGTGTCC 2640

CGCTGACAAC CTGCGGGCTG AAGGGCTCGA GTGTACCAAA ACGTGCCAGA ACTATGACCT 2700 

GGAGTGCATG AGCATGGGCT GTGTCTCTGG CTGCCTCTGC CCCCCGGGCA TGGTCCGGCA 2760 

TGAGAACAGA TGTGTGGCCC TGGAAAGGTG TCCCTGCTTC CATCAGGGCA AGGAGTATGC 2820

CCCTGGAGAA ACAGTGAAGA TTGGCTGCAA CACTTGTGTC TGTCGGGACC GGAAGTGGAA 2880

CTGCACAGAC CATGTGTGTG ATGCCACGTG CTCCACGATC GGCATGGCCC ACTACCTCAC 2940

CTTCGACGGG CTCAAATACC TGTTCCCCGG GGAGTGCCAG TACGTTCTGG TGCAGGATTA 3000

CTGCGGCAGT AACCCTGGGA CCTTTCGGAT CCTAGTGGGG AATAAGGGAT GCAGCCACCC 3060

CTCAGTGAAA TGCAAGAAAC GGGTCACCAT CCTGGTGGAG GGAGGAGAGA TTGAGCTGTT 3120 

TGACGGGGAG GTGAATGTGA AGAGGCCCAT GAAGGATGAG ACTCACTTTG AGGTGGTGGA 3180 

GTCTGGCCGG TACATCATTC TGCTGCTGGG CAAAGCCCTC TCCGTGGTCT GGGACCGCCA 3240

CCTGAGCATC TCCGTGGTCC TGAAGCAGAC ATACCAGGAG AAAGTGTGTG GCCTGTGTGG 3300

GAATTTTGAT GGCATCCAGA ACAATGACCT CACCAGCAGC AACCTCCAAG TGGAGGAAGA 3360

CCCTGTGGAC TTTGGGAACT CCTGGAAAGT GAGCTCGCAG TGTGCTGACA CCAGAAAAGT 3420 

GCCTCTGGAC TCATCCCCTG CCACCTGCCA TAACAACATC ATGAAGCAGA CGATGGTGGA 3480

TTCCTCCTGT AGAATCCTTA CCAGTGACGT CTTCCAGGAC TGCAACAAGC TGGTGGACCC 3540 

CGAGCCATAT CTGGATGTCT GCATTTACGA CACCTGCTCC TGTGAGTCCA TTGGGGACTG 3600

CGCCTGCTTC TGCGACACCA TTGCTGCCTA TGCCCACGTG TGTGCCCAGC ATGGCAAGGT 3660 

GGTGACCTGG AGGACGGCCA CATTGTGCCC CCAGAGCTGC GAGGAGAGGA ATCTCCGGGA 3720

GAACGGGTAT GAGTGTGAGT GGCGCTATAA CAGCTGTGCA CCTGCCTCTC AAGTCACGTG 3780

TCAGCACCCT GAGCCACTGG CCTGCCCTGT GCAGTGTGTG GAGGGCTGCC ATGCCCACTG 3840

CCCTCCAGGG AAAATCCTGG ATGAGCTTTT GCAGACCTGC GTTGACCCTG AAGACTGTCC 3900

AGTGTGTGAG GTGGCTGGCC GGCGTTTTGC CTCAGGAAAG AAAGTCACCT TGAATCCCAG 3960

TGACCCTGAG CACTGCCAGA TTTGCCACTG TGATGTTGTC AACCTCACCT GTGAAGCCTG 4020 

CCAGGAGCCG GGAGGCCTGG TGGTGCCTCC CACAGATGCC CCGGTGAGCC CCACCACTCT 4080 

GTATGTGGAG GACATCTCGG AACCGCCGTT GCACGATTTC TACTGCAGCA GGCTACTGGA 4140

CCTGGTCTTC CTGCTGGATG GCTCCTCCAG GCTGTCCGAG GCTGAGTTTG AAGTGCTGAA 4200

GGCCTTTGTG GTGGACATGA TGGAGCGGCT GCGCATCTCC CAGAAGTGGG TCCGCGTGGC 4260 

CGTGGTGGAG TACCACGACG GCTCCCACGC CTACATCGGG CTCAAGGACC GGAAGCGACC 4320

GTCAGAGCTG CGGCGCATTG CCAGCCAGGT GAAGTATGCG GGCAGCCAGG TGGCCTCCAC 4380

CAGCGAGGTC TTGAAATACA CACTGTTCCA AATCTTCAGC AAGATCGACC GCCCTGAAGC 4440 

CTCCCGCATC GCCCTGCTCC TGATGGCCAG CCAGGAGCCC CAACGGATGT CCCGGAACTT 4500

TGTCCGCTAC GTCCAGGGCC TGAAGAAGAA GAAGGTCATT GTGATCCCGG TGGGCATTGG 4560 

GCCCCATGCC AACCTCAAGC AGATCCGCCT CATCGAGAAG CAGGCCCCTG AGAACAAGGC 4620

CTTCGTGCTG AGCAGTGTGG ATGAGCTGGA GCAGCAAAGG GACGAGATCG TTAGCTACCT 4680 

CTGTGACCTT GCCCCTGAAG CCCCTCCTCC TACTCTGCCC CCCCACATGG CACAAGTCAC 4740 

TGTGGGCCCG GGGCTCTTGG GGGTTTCGAC CCTGGGGCCC AAGAGGAACT CCATGGTTCT 4800

GGATGTGGCG TTCGTCCTGG AAGGATCGGA CAAAATTGGT GAAGCCGACT TCAACAGGAG 4860 

CAAGGAGTTC ATGGAGGAGG TGATTCAGCG GATGGATGTG GGCCAGGACA GCATCCACGT 4920 

CACGGTGCTG CAGTACTCCT ACATGGTGAC CGTGGAGTAC CCCTTCAGCG AGGCACAGTC 4980

CAAAGGGGAC ATCCTGCAGC GGGTGCGAGA GATCCGCTAC CAGGGCGGCA ACAGGACCAA 5040

CACTGGGCTG GCCCTGCGGT ACCTCTCTGA CCACAGCTTC TTGGTCAGCC AGGGTGACCG 5100 

GGAGCAGGCG CCCAACCTGG TCTACATGGT CACCGGAAAT CCTGCCTCTG ATGAGATCAA 5160 

GAGGCTGCCT GGAGACATCC AGGTGGTGCC CATTGGAGTG GGCCCTAATG CCAACGTGCA 5220 

GGAGCTGGAG AGGATTGGCT GGCCCAATGC CCCTATCCTC ATCCAGGACT TTGAGACGCT 5280

CCCCCGAGAG GCTCCTGACC TGGTGCTGCA GAGGTGCTGC TCCGGAGAGG GGCTGCAGAT 5340 

CCCCACCCTC TCCCCTGCAC CTGACTGCAG CCAGCCCCTG GACGTGATCC TTCTCCTGGA 5400

TGGCTCCTCC AGTTTCCCAG CTTCTTATTT TGATGAAATG AAGAGTTTCG CCAAGGCTTT 5460 

CATTTCAAAA GCCAATATAG GGCCTCGTCT CACTCAGGTG TCAGTGCTGC AGTATGGAAG 5520 

CATCACCACC ATTGACGTGC CATGGAACGT GGTCCCGGAG AAAGCCCATT TGCTGAGCCT 5580 

TGTGGACGTC ATGCAGCGGG AGGGAGGCCC CAGCCAAATC GGGGATGCCT TGGGCTTTGC 5640

TGTGCGATAC TTGACTTCAG AAATGCATGG TGCCAGGCCG GGAGCCTCAA AGGCGGTGGT 5700 

CATCCTGGTC ACGGACGTCT CTGTGGATTC AGTGGATGCA GCAGCTGATG CCGCCAGGTC 5760 

CAACAGAGTG ACAGTGTTCC CTATTGGAAT TGGAGATCGC TACGATGCAG CCCAGCTACG 5820 

GATCTTGGCA GGCCCAGCAG GCGACTCCAA CGTGGTGAAG CTCCAGCGAA TCGAAGACCT 5880

CCCTACCATG GTCACCTTGG GCAATTCCTT CCTCCACAAA CTGTGCTCTG GATTTGTTAG 5940 

GATTTGCATG GATGAGGATG GGAATGAGAA GAGGCCCGGG GACGTCTGGA CCTTGCCAGA 6000

CCAGTGCCAC ACCGTGACTT GCCAGCCAGA TGGCCAGACC TTGCTGAAGA GTCATCGGGT 6060

CAACTGTGAC CGGGGGCTGA GGCCTTCGTG CCCTAACAGC CAGTCCCCTG TTAAAGTGGA 6120 

AGAGACCTGT GGCTGCCGCT GGACCTGCCC CTGCGTGTGC ACAGGCAGCT CCACTCGGCA 6180 

CATCGTGACC TTTGATGGGC AGAATTTCAA GCTGACTGGC AGCTGTTCTT ATGTCCTATT 6240

TCAAAACAAG GAGCAGGACC TGGAGGTGAT TCTCCATAAT GGTGCCTGCA GCCCTGGAGC 6300

AAGGCAGGGC TGCATGAAAT CCATCGAGGT GAAGCACAGT GCCCTCTCCG TCGAGCTGCA 6360

CAGTGACATG GAGGTGACGG TGAATGGGAG ACTGGTCTCT GTTCCTTACG TGGGTGGGAA 6420 

CATGGAAGTC AACGTTTATG GTGCCATCAT GCATGAGGTC AGATTCAATC ACCTTGGTCA 6480 

CATCTTCACA TTCACTCCAC AAAACAATGA GTTCCAACTG CAGCTCAGCC CCAAGACTTT 6540 

TGCTTCAAAG ACGTATGGTC TGTGTGGGAT CTGTGATGAG AACGGAGCCA ATGACTTCAT 6600 

GCTGAGGGAT GGCACAGTCA CCACAGACTG GAAAACACTT GTTCAGGAAT GGACTGTGCA 6660 

GCGGCCAGGG CAGACGTGCC AGCCCATCCT GGAGGAGCAG TGTCTTGTCC CCGACAGCTC 6720 

CCACTGCCAG GTCCTCCTCT TACCACTGTT TGCTGAATGC CACAAGGTCC TGGCTCCAGC 6780 

CACATTCTAT GCCATCTGCC AGCAGGACAG TTGCCACCAG GAGCAAGTGT GTGAGGTGAT 6840

CGCCTCTTAT GCCCACCTCT GTCGGACCAA CGGGGTCTGC GTTGACTGGA GGACACCTGA 6900 

TTTCTGTGCT ATGTCATGCC CACCATCTCT GGTCTACAAC CACTGTGAGC ATGGCTGTCC 6960 

CCGGCACTGT GATGGCAACG TGAGCTCCTG TGGGGACCAT CCCTCCGAAG GCTGTTTCTG 7020 

CCCTCCAGAT AAAGTCATGT TGGAAGGCAG CTGTGTCCCT GAAGAGGCCT GCACTCAGTG 7080 

CATTGGTGAG GATGGAGTCC AGCACCAGTT CCTGGAAGCC TGGGTCCCGG ACCACCAGCC 7140 

: TGCACATGCC TCAGCGGGCG GAAGGTCAAC TGCACAACGC AGCCCTGCCC 7200 

\ GCTCCCACGT GTGGCCTGTG TGAAGTAGCC CGCCTCCGCC AGAATGCAGA 7260 

CCAGTGCTGC CCCGAGTATG AGTGTGTGTG TGACCCAGTG AGCTGTGACC TGCCCCCAGT 7320 

GCCTCACTGT GAACGTGGCC TCCAGCCCAC ACTGACCAAC CCTGGCGAGT GCAGACCCAA 7380 

CTTCACCTGC GCCTGCAGGA AGGAGGAGTG CAAAAGAGTG TCCCCACCCT CCTGCCCCCC 7440 

GCACCGTTTG CCCACCCTTC GGAAGACCCA GTGCTGTGAT GAGTATGAGT GTGCCTGCAA 7500 

CTGTGTCAAC TCCACAGTGA GCTGTCCCCT TGGGTACTTG GCCTCAACCG CCACCAATGA 7560 

CTGTGGCTGT ACCAGAACCA CCTGCCTTCC CGACAAGGTG TGTGTCCACC GAAGCACCAT 7620 

CTACCCTGTG GGCCAGTTCT GGGAGGAGGG CTGCGATGTG TGCACCTGCA CCGACATGGA 7680 

GGATGCCGTG ATGGGCCTCC GCGTGGCCCA GTGCTCCCAG AAGCCCTGTG AGGACAGCTG 7740 

TCGGTCGGGC TTCACTTACG TTCTGCATGA AGGCGAGTGC TGTGGAAGGT GCCTGCCATC 7800 

TGCCTGTGAG GTGGTGACTG GCTCACCGCG GGGGGACTCC CAGTCTTCCT GGAAGAGTGT 7860 

CGGCTCCCAG TGGGCCTCCC CGGAGAACCC CTGCCTCATC AATGAGTGTG TCCGAGTGAA 7920 

GGAGGAGGTC TTTATACAAC AAAGGAACGT CTCCTGCCCC CAGCTGGAGG TCCCTGTCTG 7980 

CCCCTCGGGC TTTCAGCTGA GCTGTAAGAC CTCAGCGTGC TGCCCAAGCT GTCGCTGTGA 8040 

GCGCATGGAG GCCTGCATGC TCAATGGCAC TGTCATTGGG CCCGGGAAGA CTGTGATGAT 8100 

CGATGTGTGC ACGACCTGCC GCTGCATGGT GCAGGTGGGG GTCATCTCTG GATTCAAGCT 8160

GGAGTGCAGG AAGACCACCT GCAACCCCTG CCCCCTGGGT TACAAGGAAG AAAATAACAC 8220 

AGGTGAATGT TGTGGGAGAT GTTTGCCTAC GGCTTGCACC ATTCAGCTAA GAGGAGGACA 8280 

GATCATGACA CTGAAGCGTG ATGAGACGCT CCAGGATGGC TGTGATACTC ACTTCTGCAA 8340 

GGTCAATGAG AGAGGAGAGT ACTTCTGGGA GAAGAGGGTC ACAGGCTGCC CACCCTTTGA 8400

TGAACACAAG TGTCTGGCTG AGGGAGGTAA AATTATGAAA ATTCCAGGCA CCTGCTGTGA 8460 

CACATGTGAG GAGCCTGAGT GCAACGACAT CACTGCCAGG CTGCAGTATG TCAAGGTGGG 8520 

AAGCTGTAAG TCTGAAGTAG AGGTGGATAT CCACTACTGC CAGGGCAAAT GTGCCAGCAA 8580 

AGCCATGTAC TCCATTGACA TCAACGATGT GCAGGACCAG TGCTCCTGCT GCTCTCCGAC 8640 

ACGGACGGAG CCCATGCAGG TGGCCCTGCA CTGCACCAAT GGCTCTGTTG TGTACCATGA 8700 

GGTTCTCAAT GCCATGGAGT GCAAATGCTC CCCCAGGAAG TGCAGCAAGT GAGGCTGCTG 8760

CAGCTGCATG GGTGCCTGCT GCTGCCTGCC TTGGCCTGAT GGCCAGGCCA GAGTGCTGCC 8820

AGTCCTCTGC ATGTTCTGCT CTTGTGCCCT TCTGAGCCCA CAATAAAGGC TGAGCTCTTA 8880
TCTTGCTGCA TGTTCTGCTC TTGTGCCCTT CTGAGCCCAC AAT 

Seq ID No: 149 Protein sequence:
Protein Accession #: NP_000543.1



1 11 21 31 41 51 

I I I I I I 

MIPARFAGVL LALALILPGT LCAEGTRGRS STARCSLFGS DFVNTFDGSM YSFAGYCSYL 60 

LAGGCQKRSF SIIGDFQNGK RVSLSVYLGE FFDIHLFVNG TVTQGDQRVS MPYASKGLYL 120 

ETEAGYYKLS GEAYGFVARI DGSGNFQVLL SDRYFNKTCG LCGNFNIFAE DDFMTQEGTL 180 

TSDPYDFANS WALSSGEQWC ERASPPSSSC NISSGEMQKG LWEQCQLLKS TSVFARCHPL 240 

VDPEPFVALC EKTLCECAGG LECACPALLE YARTCAQEGM VLYGWTDHSA CSPVCPAGME 300 

YRQCVSPCAR TCQSLHINEM CQERCVDGCS CPEGQLLDEG LCVESTECPC VHSGKRYPPG 360

TSLSRDCNTC ICRNSQWICS NEECPGECLV TGQSHFKSFD NRYFTFSGIC QYLLARDCQD 420

HSFSIVIETV QCADDRDAVC TRSVTVRLPG LHNSLVKLKH GAGVAMDGQD IQLPLLKGDL 480

RIQHTVTASV RLSYGEDLQM DWDGRGRLLV KLSPVYAGKT CGLCGNYNGN QGDDFLTPSG 540

LAEPRVEDFG NAWKLHGDCQ DLQKQHSDPC ALNPRMTRFS EEACAVLTSP TFEACHRAVS 600 

PLPYLRNCRY DVCSCSDGRE CLCGALASYA AACAGRGVRV AWREPGRCEL NCPKGQVYLQ 660

CGTPCNLTCR SLSYPDEECN EACLEGCFCP PGLYMDERGD CVPKAQCPCY YDGEIFQPED 720 

IFSDHHTMCY CEDGFMHCTM SGVPGSLLPD AVLSSPLSHR SKRSLSCRPP MVKLVCPADN 780 

LRAEGLECTK TCQNYDLECM SMGCVSGCLC PPGMVRHENR CVALERCPCF HQGKEYAPGE 840 

TVKIGCNTCV CRDRKWNCTD HVCDATCSTI GMAHYLTFDG LKYLFPGECQ YVLVQDYCGS 900 

NPGTFRILVG NKGCSHPSVK CKKRVTILVE GGEIELFDGE VNVKRPMKDE THFEVVESGR 960

YIILLLGKAL SVVWDRHLSI SVVLKQTYQE KVCGLCGNFD GIQNNDLTSS NLQVEEDPVD 1020

FGNSWKVSSQ CADTRKVPLD SSPATCHNNI MKQTMVDSSC RILTSDVFQD CNKLVDPEPY 1080

LDVCIYDTCS CESIGDCACF CDTIAAYAHV CAQHGKVVTW RTATLCPQSC EERNLRENGY 1140

ECEWRYNSCA PACQVTCQHP EPLACPVQCV EGCHAHCPPG KILDELLQTC VDPEDCPVCE 1200

VAGRRFASGK KVTLNPSDPE HCQICHCDVV NLTCEACQEP GGLVVPPTDA PVSPTTLYVE 1260

DISEPPLHDF YCSRLLDLVF LLDGSSRLSE AEFEVLKAFV VDMMERLRIS QKWVRVAVVE 1320

YHDGSHAYIG LKDRKRPSEL RRIASQVKYA GSQVASTSEV LKYTLFQIFS KIDRPEASRI 1380

ALLLMASQEP QRMSRNFVRY VQGLKKKKVI VIPVGIGPHA NLKQIRLIEK QAPENKAFVL 1440 

SSVDELEQQR DEIVSYLCDL APEAPPPTLP PHMAQVTVGP GLLGVSTLGP KRNSMVLDVA 1500

FVLEGSDKIG EADFNRSKEF MEEVIQRMDV GQDSIHVTVL QYSYMVTVEY PFSEAQSKGD 1560

ILQRVREIRY QGGNRTNTGL ALRYLSDHSF LVSQGDREQA PNLVYMVTGN PASDEIKRLP 1620

GDIQVVPIGV GPNANVQELE RIGWPNAPIL IQDFETLPRE APDLVLQRCC SGEGLQIPTL 1680

SPAPDCSQPL DVILLLDGSS SFPASYFDEM KSFAKAFISK ANIGPRLTQV SVLQYGSITT 1740

IDVPWNVVPE KAHLLSLVDV MQREGGPSQI GDALGFAVRY LTSEMHGARP GASKAVVILV 1800

TDVSVDSVDA AADAARSNRV TVFPIGIGDR YDAAQLRILA GPAGDSNVVK LQRIEDLPTM 1860

VTLGNSFLHK LCSGFVRICM DEDGNEKRPG DVWTLPDQCH TVTCQPDGQT LLKSHRVNCD 1920 

RGLRPSCPNS QSPVKVEETC GCRWTCPCVC TGSSTRHIVT FDGQNFKLTG SCSYVLFQNK 1980 

EQDLEVILHN GACSPGARQG CMKSIEVKHS ALSVELHSDM EVTVNGRLVS VPYVGGNMEV 2040 

NVYGAIMHEV RFNHLGHIFT FTPQNNEFQL QLSPKTFASK TYGLCGICDE NGANDFMLRD 2100

GTVTTDWKTL VQEWTVQRPG QTCQPILEEQ CLVPDSSHCQ VLLLPLFAEC HKVLAPATFY 2160

AICQQDSCHQ EQVCEVIASY AHLCRTNGVC VDWRTPDFCA MSCPPSLVYN HCEHGCPRHC 2220 

DGNVSSCGDH PSEGCFCPPD KVMLEGSCVP EEACTQCIGE DGVQHQFLEA WVPDHQPCQI 2280 

CTCLSGRKVN CTTQPCPTAK APTCGLCEVA RLRQNADQCC PEYECVCDPV SCDLPPVPHC 2340 

ERGLQPTLTN PGECRPNFTC ACRKEECKRV SPPSCPPHRL PTLRKTQCCD EYECACNCVN 2400

STVSCPLGYL ASTATNDCGC TTTTCLPDKV CVHRSTIYPV GQFWEEGCDV CTCTDMEDAV 2460

MGLRVAQCSQ KPCEDSCRSG FTYVLHEGEC CGRCLPSACE VVTGSPRGDS QSSWKSVGSQ 2520

WASPENPCLI NECVRVKEEV FIQQRNVSCP QLEVPVCPSG FQLSCKTSAC CPSCRCERME 2580 

ACMLNGTVIG PGKTVMIDVC TTCRCMVQVG VISGFKLECR KTTCNPCPLG YKEENNTGEC 2640 

CGRCLPTACT IQLRGGQIMT LKRDETLQDG CDTHFCKVNE RGEYFWEKRV TGCPPFDEHK 2700

CLAEGGKIMK IPGTCCDTCE EPECNDITAR LQYVKVGSCK SEVEVDIHYC QGKCASKAMY 2760
SIDINDVQDQ CSCCSPTRTE PMQVALHCTN GSVVYHEVLN AMECKCSPRK CSK

Seq ID NO: 150 DNA sequence

Nucleic Acid Accession #: NM_001508.1 

Coding sequence: 1-1362 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

I I I I I I

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTGATCA CAGTCATGTC 60

CCCGAGTTTG AGGTGGCCAC CTGGATCAAA ATCACCCTTA TTCTGGTGTA CCTGATCATC 120 

TTCGTGATGG GCCTTCTGGG GAACAGCGTC ACCATTCGGG TCACCCAGGT GCTGCAGAAG 180 

AAAGGATACT TGCAGAAGGA GGTGACAGAC CACATGGTGA GTTTGGCTTG CTCGGACATC 240

TTGGTGTTCC TCATCGGCAT GCCCATGGAG TTCTACAGCA TCATCTGGAA TCCCCTGACC 300

ACGTCCAGCT ACACCCTGTC CTGCAAGCTG CACACTTTCC TCTTCGAGGC CTGCAGCTAC 360

GCTACGCTGC TGCACGTGCT GACGCTCAGC TTTGAGCGCT ACATCGCCAT CTGTCACCCC 420 

TTCAGGTACA AGGCTGTGTC GGGACCTTGC CAGGTGAAGC TGCTGATTGG CTTCGTCTGG 480

GTCACCTCCG CCCTGGTGGC ACTGCCCTTG CTGTTTGCCA TGGGTACTGA GTACCCCCTG 540

GTGAACGTGC CCAGCCACCG GGGTCTCACT TGCAACCGCT CCAGCACCCG CCACCACGAG 600

CAGCCCGAGA CCTCCAATAT GTCCATCTGT ACCAACCTCT CCAGCCGCTG GACCGTGTTC 660

CAGTCCAGCA TCTTCGGCGC CTTCGTGGTC TACCTCGTGG TCCTGCTCTC CGTAGCCTTC 720 

ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCCAGA AGGGCTCGCT GGCCGGGGGC 780 

ACGCGGCCTC CGCAGCTGAG GAAGTCCGAG AGCGAAGAGA GCAGGACCGC CAGGAGGCAG 840

ACCATCATCT TCCTGAGGCT GATTGTTGTG ACATTGGCCG TATGCTGGAT GCCCAACCAG 900

ATTCGGAGGA TCATGGCTGC GGCCAAACCC AAGCACGACT GGACGAGGTC CTACTTCCGG 960

GCGTACATGA TCCTCCTCCC CTTCTCGGAG ACGTTTTTCT ACCTCAGCTC GGTCATCAAC 1020 

CCGCTCCTGT ACACGGTGTC CTCGCAGCAG TTTCGGCGGG TGTTCGTGCA GGTGCTGTGC 1080 

TGCCGCCTGT CGCTGCAGCA CGCCAACCAC GAGAAGCGCC TGCGCGTACA TGCGCACTCC 1140

ACCACCGACA GCGCCCGCTT TGTGCAGCGC CCGTTGCTCT TCGCGTCCCG GCGCCAGTCC 1200

TCTGCAAGGA GAACTGAGAA GATTTTCTTA AGCACTTTTC AGAGCGAGGC CGAGCCCCAG 1260 

TCTAAGTCCC AGTCATTGAG TCTCGAGTCA CTAGAGCCCA ACTCAGGCGC GAAACCAGCC 1320 
AATTCTGCTG CAGAGAATGG TTTTCAGGAG CATGAAGTTT GA

Seq ID No: 151 Protein sequence:



1 11 21 31 41 51 

I I I I I I 

MASPSLPGSD CSQIIDHSHV PEFEVATWIK ITLILVYLII FVMGLLGNSV TIRVTQVLQK 
KGYLQKEVTD HMVSLACSDI LVFLIGMPME FYSIIWNPLT TSSYTLSCKL HTFLFEACSY 
ATLLHVLTLS FERYIAICHP FRYKAVSGPC QVKLLIGFVW VTSALVALPL LFAMGTEYPL 
VNVPSHRGLT CNRSSTRHHE QPETSNMSIC TNLSSRWTVF QSSIFGAFVV YLVVLLSVAF
MCWNMMQVLM KSQKGSLAGG TRPPQLRKSE SEESRTARRQ TIIFLRLIVV TLAVCWMPNQ
IRRIMAAAKP KHDWTRSYFR AYMILLPFSE TFFYLSSVIN PLLYTVSSQQ FRRVFVQVLC 
CRLSLQHANH EKRLRVHAHS TTDSARFVQR PLLFASRRQS SARRTEKIFL STFQSEAEPQ 
SKSQSLSLES LEPNSGAKPA NSAAENGFQE HEV 

Seq ID NO: 152 DNA sequence

Nucleic Acid Accession #: none found

Coding sequence: 3-65 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

TTATTATTTT GTGTAAACTA TATTCTGCTT ATAGAGAGTC TCTGAGACTA AAATTGACAA 60

CTTGAAAAGT ATTCCAAGGA ATATTATGAA AATAGGGCAA CATGGACTGT TTAAGATCTC 120

CATGTAATTG AAATTCATGC AAGGAAACAA CTCATAGAAA AGATAAATAT GGATGCCCTT 180 

CACATGTTAT CAACCTCGTA ACTTTTGGTG CTTGCTGAAT CAGTCCATGA AAAGCTACAG 240

CCCGCTCTTT GGGAATGCTA CATACCCATT TCTGGTATTT AAAAAATATC TAGGAGGAGC 300 

TAAATGACAA AACACAGCAG TGTTTTGAGG GAGAAAGGAC CATCATTTAT AATGCTCTGT 360

ACATACTACC AGAGCTGCTT GGAAAATTAA AGGCCACTTG TGGCTTTTTC CTACCAACTG 420

ATACGTTTAA ATTTGCCCTA GGATTSAGCT AACAGCAAAA AAAAAAAAAA AAAAAAAARA 480 

GAGAGAAAGA AAGGAGKAAA CAGTGGTAAT AAAAAAATCC ATCTGTCTTC TTGCTATGTT 540 

AATATTAATA AATCATAATA TGACAAGACC CTCACTGAAT AAGAGTATTT TCAGTCATCA 600 

GAAGCCAGCT GTTGGTAGGC ATTAATGAGT TTAAAATTGT TCTCAATTGA AAAAACATCA 660 

CACTATTTTG CCAAAACCAA AGTAATTATA ATACTGTGTC CTCCTGTAAT TTTTTGAGAA 720

GTGGTTATAA AGGGCATATT TACATAAATT CTACTTTATT CCTCAACTTC TTTGATGAAT 780

GTAACCCAAT TTTACTTCTT TAAAAAGTCT CAATTCAAGC TGGATTAGCC AGCTCAGCAT 840

AATCAACTAG ACAGTGGTTT GTTAAATTTA GCAGCATACT TCGTTCCCAT TCTAATTAAA 900

GTCATGAGTT CTTGAATCCC AGAGAAATAA TGCTTAGGAA CTTCTCTCAA TCTGCTTGGC 960 

TTGGCCTAGA GAAGTGGCCA TTTTATCAAC AGGRAAAAAA AAAATTTTCT CTACTACAAC 1020

CCCGTTGCCT TCTGAAAAAC AGCAAGTTAT TTCTTTATAT AATTATCATT TTATTATTTT 1080

ATGGAAAATT AATTTATTAA TTAATAGCCT ATTATGTGTT CTCACTTGCT TCTCTAAGTA 1140

ATATTTTGAG ATAAAATGTT GAATAAAACC ATGGATTATA GAGAAAAGTC AAAATATATG 1200 
TGTAATATTT AATTATTTTA TAAGTTTTAT AATAAAGTAT TCCATTTCTT TATCTT 



Seq ID No: 153 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

IILCKLYSAY RESLRLKLTT 

Seq ID NO: 154 DNA sequence

Nucleic Acid Accession #: none found

Coding sequence: 1-36 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CTGGATGATA TGGAAGAAAT GGATGGGTTA AGGTAAAAGG CTGATCACAG ATGGGTTCCT 60

CTCAAGGTTA AAATAGTTTA AGTGCCAGAA GAAAAGGTGG GCACCAGCGA ATTAAGAACC 120

ATCTTTGAAT GGTCCCCTTG GTTAAATACT TAACTTTTGT CATCAGTGTC TGCATTTATG 180

AAATGAAGAG GAATTCACTA ATATGCTACG TGATCTTTTG TTTCTCATGA AAAGAGTTAC 240

TGTTGTGTAG TTCTCTGTTC CAGGGCTGCC TTTGCTCCAC AAAGCACTGA GAAGCAGTGG 300

CCCTGTACAA CCATACTGCC TCTCAACACT GTGTAATAGG CTAACACCGC CCAGCGAACC 360
TTCCTGGGAG ATATAAAATA CATAGGTTTA GGCTGGCAAA AAAAAAAAAA AAA

Seq ID No: 155 Protein sequence:
Protein Accession #: none found



Seq ID NO: 156 DNA sequence 

Nucleic Acid Accession #: NM_032961.1 

Coding sequence: 827-3949 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

CAGGCTCAGA GGCTGAAGCA GGAGGAAGGA AGGACTGGAA GGAAAAAGAO ACAGGTTAGA 60

GGGAAAGAGG CTTGGGAAGA AAACAGCAGA AAAGAAACTG CTCATTACAC TTACAGAGAG 120

GCAAGTAACG GTGGAGATGA GGACAGAGGG AACCAAGACT CTGAAAGACA AAAAATACAA 180

ATAGAGCGAA AGAGGAAAAA AATGTCAAGA AGAACATCCA TCCGGAGAAA TGAAGAGAAT 240

GAAAGTTTTA AACTGCAGAG CCGTTCTGTG CTTTTCCGGC ACAAAATTAT ATCGCTGATT 300

TTAAGCCCTT TTGCATTTGC CAGCCGTTGA CATTAAGAGG CATGTTTAAC GGTGCCAACA 360

GCATCTCCTT TTCCTTCTCC TCTTCCTCTT CTTCTTCTTC CTCCTCCTCC TCCTCTTTTT 420

CCTCCTCCTC GTTCTCCTCC CATCAGCAAG AAGACAAACC GAGGACAGTC TTGAAATATC 480

GAAATTTCCT CTTTGGGATT TGCCAGCGCC AAGACTGTCG GAATAAAGGA CGCTGACTAT 540

TGTATTATTG TTATTTTATT AATTAGTCAG TGGAAAGATT ACAGATGAGG AAAGGGGACG 600

CCTGTCACCC TTCCTGTGCT AAGATTTAAA AAAAAATGAG GCTGGATTGC GGGAAGCTCT 660

AAAATGAAGC AAAAGGAGTA AGATTTTTAA AGACAGAAAG CCACAGGAGC CCCCACGTAG 720

CGCACTTTTA TTTGTATTTT TTCAGATTTT TTTTTGTTTC GTGGTGGTGG GGGAGGTGAT 780

TGGGTGGCTG ACTGGCTGCG GGAAGCTACT TCCTTTCCTT TTGGAGATGA TTGTGCTATT 840

ATTGTTTGCC TTGCTCTGGA TGGTGGAAGG AGTCTTTTCC CAGCTTCACT ACACGGTACA 900

GGAGGAGCAG GAACATGGCA CTTTCGTGGG GAATATCGCT GAAGATCTGG GTCTGGACAT 960

TACAAAACTT TCGGCTCGCG GGTTTCAGAC GGTGCCCAAC TCAAGGACCC CTTACTTAGA 1020

CCTCAACCTG GAGACAGGGG TGCTGTACGT GAACGAGAAA ATAGACCGCG AACAAATCTG 1080

CAAACAGAGC CCCTCCTGTG TCCTGCACCT GGAGGTCTTT CTGGAGAACC CCCTGGAGCT 1140

GTTCCAGGTG GAGATCGAGG TGCTGGACAT TAATGACAAC CCCCCCTCTT TCCCGGAGCC 1200

AGACCTGACG GTGGAAATCT CTGAGAGCGC CACGCCAGGC ACTCGCTTCC CCTTGGAGAG 1260

CGCATTCGAC CCAGACGTGG GCACCAACTC CTTGCGCGAC TACGAGATCA CCCCCAACAG 1320

CTACTTCTCC CTGGACGTGC AGACCCAGGG GGATGGCAAC CGATTCGCTG AGCTGGTGCT 1380

GGAGAAGCCA CTGGACCGAG AGCAGCAAGC GGTGCACCGC TACGTGCTGA CCGCGGTGGA 1440

CGGAGGAGGT GGGGGAGGAG TAGGAGAAGG AGGGGGAGGT GGCGGGGGAG CAGGCCTGCC 1500

CCCCCAGCAG CAGCGCACCG GCACGGCCCT ACTCACCATC CGAGTGCTGG ACTCCAATGA 1560

CAATGTGCCC GCTTTCGACC AACCCGTCTA CACTGTGTCC CTACCAGAGA ACTCTCCCCC 1620

AGGCACTCTC GTGATCCAGC TCAACGCCAC CGACCCGGAC GAGGGCCAGA ACGGTGAGGT 1680

CGTGTACTCC TTCAGCAGCC ACATTTCGCC CCGGGCGCGG GAGCTTTTCG GACTCTCGCC 1740

GCGCACTGGC AGACTGGAGG TAAGCGGCGA GTTGGACTAT GAAGAGAGCC CAGTGTACCA 1800

AGTGTACGTG CAAGCCAAGG ACCTGGGCCC CAACGCCGTG CCTGCGCACT GCAAGGTGCT 1860

AGTGCGAGTA CTGGATGCTA ATGACAACGC GCCAGAGATC AGCTTCAGCA CCGTGAAGGA 1920

AGCGGTGAGT GAGGGCGCGG CGCCCGGCAC TGTGGTGGCC CTTTTCAGCG TGACTGACCG 1980

CGACTCAGAG GAGAATGGGC AGGTGCAGTG CGAGCTACTG GGAGACGTGC CTTTCCGCCT 2040

CAAGTCTTCC TTTAAGAATT ACTACACCAT CGTTACCGAA GCCCCCCTGG ACCGAGAGGC 2100

GGGGGACTCC TACACCCTGA CTGTAGTGGC TCGGGACCGG GGCGAGCCTG CGCTCTCCAC 2160

CAGTAAGTCG ATCCAGGTAC AAGTGTCGGA TGTGAACGAC AACGCGCCGC GTTTCAGCCA 2220

GCCGGTCTAC GACGTGTATG TGACTGAAAA CAACGTGCCT GGCGCCTACA TCTACGCGGT 2280

GAGCGCCACC GACCGGGATG AGGGCGCCAA CGCCCAGCTT GCCTACTCTA TCCTCGAGTG 2340

CCAGATCCAG GGCATGAGCG TCTTCACCTA CGTTTCTATC AACTCTGAGA ACGGCTACTT 2400

GTACGCCCTG CGCTCCTTCG ACTATGAGCA GCTGAAGGAC TTCAGTTTTC AGGTGGAAGC 2460

CCGGGACGCT GGCAGCCCCC AGGCGCTGGC TGGTAACGCC ACTGTCAACA TCCTCATAGT 2520

GGATCAAAAT GACAACGCCC CTGCCATCGT GGCGCCTCTA CCAGGGCGCA ACGGGACTCC 2580

AGCGCGTGAG GTGCTGCCCC GCTCGGCGGA GCCGGGTTAC CTGCTCACCC GCGTGGCCGC 2640

CGTGGACGCG GACGACGGCG AGAACGCCCG GCTCACTTAC AGCATCGTGC GTGGCAACGA 2700

AATGAACCTC TTTCGCATGG ACTGGCGCAC CGGGGAGCTG CGCACAGCAC GCCGAGTCCC 2760

GGCCAAGCGC GACCCCCAGC GGCCTTATGA GCTGGTGATC GAGGTGCGCG ACCATGGGCA 2820

GCCGCCCCTT TCCTCCACCG CCACCCTGGT GGTTCAGCTG GTGGATGGCG CCGTGGAGCC 2880

CCAGGGCGGG GGCGGGAGCG GAGGCGGAGG GTCAGGAGAG CACCAGCGCC CCAGICGCTC 2940

TGGCGGCGGG GAAACCTCGC TAGACCTCAC CCTCATCCTC ATCATCGCGT TGGGCTCGGT 3000

GTCCTTCATC TTCCTGCTGG CCATGATCGT GCTGGCCGTG CGTTGCCAAA AAGAGAAGAA 3060

GCTCAACATC TATACTTGTC TGGCCAGCGA TTGCTGCCTC TGCTGCTGCT GCTGCGGTGG 3120

CGGAGGTTCG ACCTGCTGTG GCCGCCAAGC CCGGGCGCGC AAGAAGAAAC TCAGCAAGTC 3180

AGACATCATG CTGGTGCAGA GCTCCAATGT ACCCAGTAAC CCGGCCCAGG TGCCGATAGA 3240

GGAGTCCGGG GGCTTTGGCT CCCACCACCA CAACCAGAAT TACTGCTATC AGGTATGCCT 3300

GACCCCTGAG TCCGCCAAGA CCGACCTGAT GTTTCTTAAG CCCTGCAGCC CTTCGCGGAG 3360

TACGGACACT GAGCACAACC CCTGCGGGGC CATCGTCACC GGTTACACCG ACCAGCAGCC 3420

TGATATCATC TCCAACGGAA GCATTTTGTC CAACGAGACT AAACACCAGC GAGCAGAGCT 3480

CAGCTATCTA GTTGACAGAC CTCGCCGAGT TAACAGTTCT GCATTCCAGG AAGCCGACAT 3540

AGTAAGCTCT AAGGACAGTG GTCATGGAGA CAGTGAACAG GGAGATAGTG ATCATGATGC 3600

CACCAACCGT GCCCAGTCAG CTGGTATGGA TCTCTTCTCC AATTGCACTG AGGAATGTAA 3660

AGCTCTGGGC CACTCAGATC GGTGCTGGAT GCCTTCTTTT GTCCCTTCTG ATGGACGCCA 3720

GGCTGCTGAT TATCGCAGCA ATCTGCATGT TCCTGGCATG GACTCTGTTC CAGACACTGA 3780

GGTGTTTGAA ACTCCAGAAG CCCAGCCTGG GGCAGAGCGG TCCTTTTCCA CCTTTGGCAA 3840

AGAGAAGGCC CTTCACAGCA CTCTGGAGAG GAAGGAGCTG GATGGACTGC TGACTAATAC 3900

GCGAGCGCCT TACAAACCAC CATATTTGAC ACGGAAAAGG ATATGCTAGT CAATTCTACA 3960

GGACTTACCT GAAGCAGCAT GATTTGCACA AAGTCGACCA ACAAAAGCAT CAACTTTTCA 4020

ACTTCATTAT CTTGGCCATC CAGTTAGTCA TGTGTAACTG AGTATTAGAT TTCGGATGGA 4080

GTCATCATGG CCAATTATAG GACCTAATTG CTCTCAGCAG GCCTGAGAAA TGAGTTGAAA 4140

TGTGCAGAAC TGTAGAAACT TTAGAGGCAA CAGATTTTGC CTCCCCGATC AGTGTGTGCC 4200

TGTTTACAGC ACTATCTATC TTTCTCTCTC CAAATGTCAC TGAGCCCTTT AGATGTTTAT 4260

ATTCACCACG AGAAGCCAGT CATAAAGATA AAGGAAATTT GTGCATTATA AATGCAATAT 4320

CACTGTTTTA AACTTGACTG TTTTATATTA TTTTTGTGTG ATCAAGTGTT CCGCAAGCTA 4380

TTCCAACTTT ACAAGAGAAA TTGTGATTAT GTTCTTTTCA CCTGTGGGTT ATAAAAAATG 4440

TTGTATTCTG AAGACCCACA AAATATCAAA GACATTCTGT AGTTTATACA CCGTGTTGCA 4500

AAGTGTTTAC TGTACTATTT CAAAGCTTCT AAATAAATAT AAAATATATA TATTATATTA 4560

TATAATTTTC CTAAAATGTG GTACAACTCA GTTGGTTTTT AAATGGATGC ATACAGTCCA 4620

CATCATACAA TAAAATAAAA GGTAATTCAG GGTCCCAAAG ACAAACTTAC TAAGAAAAAA 4680

TCATTAATAG TTTTCTCCCA ATTTCCATAT CTTACTCAAC CGTGTTTTTC CTTGTTTAAA 4740

AGAAAATGAT GCTCTAAGCT ACAAAATTTT GTCAAAAACT CATATTGAAT TTTCAATGCC 4800

AAAGATGTAG CTATTGATGT TATCAGACAG AGCACTGACT ATGTACTATC AAACTATCTA 4860

ACAATCTGCA TAAGTCTGAT TCTATTTCTA TGACTTTGAA TTTAGAATCA CTTAAAGCTT 4920

TTATAAAGAA TCGATAAATT CACCTGTATT TGTTGTTAGA AAAAAACTGG GTGTCTGTAC 4980

ATTTTGTGGT GTAAAATATG TAATTGAAGA TTACTATTTT AAGAAGTCAT CAGTCATATC 5040

ACTCACACAG AATTTTATTT TACATAGTTT TGTGACTTAA TTACACATGA ATATAAAATC 5100

TATAATTCTA TATGAATATA TAGAGATATA GAAACATCTG AACTGGTAAA GAATAACTAT 5160

AAAATATGAA AGCTCTAAAT TTAAAATAAA TTTAGAGATA GAATCATGGT ACATTATTGT 5220

TTCAGTATTC CATGTAAAAA TTTTATAGCT TAAATGTAGT CAGTGTTTGA TTAATGAAAA 5280

AATTCTTCAT GAGTCAGCCT TCAAAAGTTA AGCTTGCCTT TTACTTTTAT GTCAACAATA 5340
TTAATTATTA AATTTAGTAA GACGCAAAAA AAAAAAAAAA AAAA

Seq ID No: 157 Protein sequence: 
Protein Accession #: NP_116586.1 



MIVLLLFALL 
TPYLDLNLET 
SFPEPDLTVE 



I 



GVLYVNEKID 
ISESATPGTR 
REQQAVHRYV 
DQPVYTVSLP 
EVSGELDYEE 
AAPGTWALF 
LTWARDRGE 



PQALAGNATV 
GENARLTYSI 
TATLWQLVD 
LAMIVLAVRC 
QSSNVPSNPA 
NPCGAIVTGY 



REQICKQSPS 
FPLESAFDPD 
LTAVDGGGGG 
ENSPPGTLVI 
SPVYQVYVQA 
SVTDRDSEEN 
PALSTSKSIQ 
SILECQIQGM 
NILIVDQNDN 
VRGNEMNLFR 
GAVEPQGGGG 
QKEKKLNIYT 
QVPIEESGGF 
TDQQPDIISN 
SDHDATNRAQ 
VPDTEVFETP 



GTFVGNIAED 
CVLHLEVFLE 
VGTNSLRDYE 



QLNATDPDEG 
KDLGPNAVPA 
GQVQCELLGD 
VQVSDVNDNA 
SVFTYVSINS 
APAIVAPLPG 
MDWRTGELRT 
SGGGGSGEHQ 
CLASDCCLCC 



I I 

LGLDITKLSA RGFQTVPNSR 60 
NPLELFQVEI EVLDINDNPP 12 0 
ITPNSYFSLD VQTQGDGNRF 18 0 
GAGLPFQQQR TGTALLTIRV 24 0 
QNGEWYSFS £ 
HCKVLVRVLD i 
VPFRLKSSFK N 
PRFSQPVYDV Y 
ENGYLYALRS f 



ARRVPAKRDP 



ANDNAPEISF 
NYYTIVTEAP 
YVTENNVPGA 
FDYEQLKDFS 
PRSAEPGYLL 
QRPYELVIEV 
SLDLTLILII 



LDREAGDSYT 
YIYAVSATDR 
FQVEARDAGS 
TRVAAVDADD 
RDHGQPPLSS 
ALGSVSFIFL 
KLSKSDIMLV 
SPSRSTDTEH 
QEADIVSSKD 
SDGRQAADYR 
LLTNTRAPYK 

Seq ID NO: 158 DNA sequence 

Nucleic Acid Accession #: NM_022159.1 

Coding sequence: 70-1890 (underlined sequences correspond to start and stop codons) 



GSILSNETKH 
SAGMDLFSNC 
EAQPGAERSF 



CCCGGGGSTC 
YQVCLTPESA 
QRAELSYLVD 



KTDLMFLKPC 
RPRRVNSSAF 
DRCWMPSFVP 
STLERKELDG 



GTGAAATTTA 
TATTATTGTA
AATGATGGAA 
ATAGCTGCAA 
TTGCTACAAG 
TATATAGAAA 
GCCAAGGACA 
GTTCAAAGGG 
CTTACAAAAC 
AAGACCACAG 
TCATATAACA 
TTTCCAAAGA 



AACTCCAGTC 
TGTGTGTACC 
CCGTCTGTAT 



TATGATAATT 
TCAAACCCAC 
GTCACAGATA 
GGCAGCTGGT 
CGCTGTAATC 
AAAGATTATA 
CTTGCCATAT 
ATTCACAAAA 
AATACAAATA 



AAGTCTATAG 
TATTAGCTGA 
CCCTTTCTAA 
ATACATTTGT 
TCATGCACAC 
AGTTTGATAC 
TGAAACATAT 
GAAAAGCTGC 
TTGGTCCTTT 
GTGAAGAGGA 
CCACATTATA 



GTCATCTACA 
GCCGTGGTAG 
TGTTGGCTTA 
ATTCTTGTTA 
GGGTTGAAAC 
GCTCTTCTGT 
TCAGTGGTTA 
TTATTCCTGT 
GTCCCCTGTT 
ACAAAAATAA 
CCAATTATTA 
AACTGTAGAT 
AATAGTTCTG 



CTTCAGAGGG 
ACCTGACACA 
ATATTCTTAC 
GCATTTTTAC 
ATCTTTGCTG 
CTAATAAGCT 
TTGCATGGAT 
ACAAGGGATT 
TTGGATTTTC 
GCACCGAAAA 
ATCTCTTGGC 
CAGAAGTTAG 
TCCTTCTCGG 
CAGCTTACCT 
GTGTTTTATC 
GTTTTGGATG 
AAATTCCAAG 
ACTACTAGAC 
AATAAGGTAA 
TCAAAAATAG 



21 

I 

CTGTGGCGAA 
TGGCTTCAGA 
AGAAAATGTG 
AACTTTAACA 
AAATTCTGTG 
ATCATCTTCA 
CTCAACTCTT 
AGTTTGGGAC 
TGTTGAACAA 
AAATTCAACG 
TCATCCTCAT 
ATATGATTCA 
GCTTTCATCA 
GGAAAGAGTC 
TGAACTTGAA 
TCTATGTGCA 
CTGTGAGCTG 
TTTTGCAATT 
AAGGATCACT 
CTTCTGGTTC 
TAGCCTATTT 
CTTCTGTTCA 
GTGCATTGAA 
TTTGCACAAG 
GGCAGCACTA 
CAACTTTATT 
TTTTGGAGTC 
TTGCTTTGAG 
CACCACCTGG 
CTTCACAGTC 
TAGAAAGATT 
TTTAAGG TAA 
CTGTGGATGA 
AAAAAGTATT 
AATTATGTAT 
TATTGCAGAT 



AATGCTAATT 



AATGCAAACT 
AAAATCAGAT 
ACAGATCTTT 
TTACTAGGTT 
ACTGAATTTG 
AAGTTATCTG 
GCTACTTTAA 
GATATAGCTC 
ATGAATATGG 
AATGGCAATG 
TCTGACAACT 
ATATCTTCAG 
AAAATAACAT 
TTTTGGAATT 
ACATACTCAA 
TTGATGTCCT 
CAACTAGGAA 
TTCAGTGAAA 
CTTGCTGAAC 
ATCATTGCCG 
GGCATACATC 



GCACTAACAC 
ACCAAGACAG 
GCCATTTAGA 
CCATAAAAGA 



ACAAGAACAA 
TAAAAACCGT 
TGAATCATAG 
GGATATCCCA 
TCAAAGTTTT 
ATGGAGACTA 
TTGCAGTTGC 
TCTTATTGAA 
TAATITCAGT 
TTACATTAAG 
ACTCACCTGA 
ATGAGACCCA 



TAATTATTTC 
TTCAAAGCAC 
TTGTTTTTCT 



GGATACAGAT 
TGGAGTTTTA 
ATCATATACA 
AACATAAGGT 
ATCTTTGGGG 
AGCAATGCTT 
CAAGAAGAAT 
ACATAGAGAA 
CCAATGTATA 
TTAAATCAGT 
CATATAGATA 
ATTTGGAAAG 



TCTATCTCAT 
TCTTTGGCTA 
ATTATGGCAC 
TAGGACCAGC 
AAGTTTTTCG 
CTTGTGCAAG 
TTCTCCATGT 



TGGTGGATAA 
AAAATGACTC 
TTTTCTGTTT 

TACTATGTTT 
TAATTGGTTT 



AGAAGGAAGT 
GTTTATCACT
TAATGTCTGT 
ACCTGTGGCT 
TATAATTACA 
CACTATCTCA 
GAATAATTTT 
GAGAACACAT 
GAGCTTCCAA 
CTTTTTTGAT 
CATAAATATA 
ATTTTTATAT 
ACCTCAAAAT 
CTCAATGAGC 
TCATCGAAAG 
TACCATGAAT 
CACCTCATGC 
CATTGGTATT 
ACTGATTTGT 
CAGGACAACA 
TGTTGGGATC 
CTACTTCTTT 
TGTTGTGGGT 
TCTAAGCCCA 
AACCAAAGTA 
ATGCCTAATC 
TCACACTGCA 
AGGAGCCCTC 
TGTGCACGCA 
GTTCATTTTT 
GTTCAAAAAT 
TTACAACTGC 
ATCAAATTAT 
ATGCTATAGG 
TTCTATGTGA 
CTCAGGAGTG 



1800 
1860 
1920 
1980 
2040 
2100 
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ATATCACTGC ACCCAAGGAA AGATTTTCTT TCTAACACGA GAAGTATATG AATGTCCTGA 
AGGAAACCAC TGGCTTGATA TTTCTGTGAC TCGTGTTGCC TTTGAAACTA GTCCCCTACC 
ACCTCGGTAA TGAGCTCCAT TACAGAAAGT GGAACATAAG AGAATGAAGG GGCAGAATAT 
CAAACAGTGA AAAGGGAATG ATAAGATGTA TTTTGAATGA ACTGTTTTTT CTGTAGACTA 
GCTGAGAAAT TGTTGACATA AAATAAAGAA TTGAAGAAAC ACATTTTACC ATTTTGTGAA 
TTGTTCTGAA CTTAAATGTC CACTAAAACA ACTTAGACTT CTGTTTGCTA AATCTGTTTC 
TTTTTCTAAT ATTCTAAAA 



1 11 21 31 41 51 

I I I I I I

MCVPGFRSSS NQDRFITNDG TVCIENWAN CHIjDNVCIAA NINKTLTKIR SIKEPVALLQ 

EVYRNSVTDL SPTDIITYIE ILAESSSLLG YKNNTISAKD TLSNSTLTEF VKTVNNFVQR 

DTFWWDKLS VNHRRTHLTK LMHTVEQATL RISQSFQKTT EFDTNSTDIA LKVFFFDSYN 

MKHIHPHMNM DGDYINIFPK RKAAYDSNGN VAVAFLYYKS IGPLLSSSDN FLLKPQNYDN 

SEEEERVISS VISVSMSSNP PTLYELEKIT FTLSHRKVTD RYRSLCAFWN YSPDTMNGSW 

SSEGCELTYS NETHTSCRCN HLTHFAXLMS SGPSIGIKDY NILTRITQLG IIISLICLAI 

CIFTFWFFSE IQSTRTTIHK NLCCSLFLAE LVFLVGINTN TNKLFCSIIA GLLHYFFLAA 

FAWMCIEGIH LYLIWGVIY NKGFLHKNFY IFGYLSPAW VGFSAALGYR YYGTTKVCWL 

STENNFIWSF IGPACIiIILV NLLAFGVIIY KVFRHTAGLK PEVSCFEMIR SCARGALALL 

FLLGTTWIFG VLHWHASW TAYLFTVSNA FQGMFIFLFL CVLSRKIQEE YYRLFKNVPC 
CFGCLR 



3 start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TGT CTGCTTA TGCGGTGGCT CGCTGCTCAG AACAGGATGG CAGAGATGAG CACCACCATC 
AAAAACTCAA GGACCAGTGC TGTGGGTCCA GTCATCTGTT TCATGGA^TT CACCAGTCTG 
GTATCTTCAA AATCCAGAAG GATGATGGCA GATGGCAGGA AGGAGGAAGA GGGTAATCTG 
GAAGAGTTTC CTGACCTACT CTGCTGCTGT GATTAAACAA CCACCAGGAA ATTTTGATGA 
CACTGTTCTC CTGAGCTCCT CCCTTTCCTC GGGGAAGAAA AGCATTGAAA CTACAAAAAT 
AAAGTGTTAT TTGGCTGGAG TGAGGTCTCA TGTCTGCTTA TGCGGTGGCT CGCTGCTCAG 
AACAGGGAAC CATTGGAGAT ACTCATTACT CTTTGAAGGC TTACAGTGGA ATGAATTCAA 
ATACGACTTA TTTGAGGAAT TGAAGTTGAC TTTATGGAGC TGATAAGAAT CTTCTTGGAG 
AAAAAAAGAC TGGTACTTCT GAATTAACCA AAATCACAGT ATTCTGAAGA TGATTCTACA 
AAGCCTGCTG TTTCTACAAA GGCTGCTGAT GATTTCTACA AAGCCTGCTG TAGTGTTGCT 
GTGGCCTCTG CTTAAAAAAG TAGAAAACAC ATTGATGCAG CATGTTCACC CCAACCTCCC 
TGCCTAAAGG CTCAGGGACC ATCTTGGAAG AGGAAGGCGC GTGAGATTGT AAGAGCCGAA 
TTAGGGGGAT GGAGTGTGGA GAATAAGGAC ACTTCATCTT GGATGCTCAC CTGCCAAATT 
GACTTCTGAT GAAAGCCAGC TCCAGAAATG TGCCTACAGT TACTACTTTC ACCTAAACCC 
TGCCCTTAGT CAAATCCTTC TCTTCTTCTA AGCAATCAAC TTCAATTCCT TGTATAACCC 
ACAGTATAAA AGGGCTTTTA TACCATTCTA TCCTATTGCA TGTAAGCCTT GGGTCTGGGA 
GGTAACAGTG TGGGATTCCA CCATCTCATC TCCCTGCCAC CCAAACATGC CTGCTCTTCT 
TTAAGCAATA TTAAATGTTT GTACTTCA 



I I I I I I 

CLLMRWLAAQ NRMAEMSTTI KNSRTSAVGP VICFMEFTSL VSSKSRRMMA DGRKEEEGNL 60 
EEFPDLLCCC D 

Seq ID NO: 162 DNA sequence

Nucleic Acid Accession #: none found 

Coding sequence: 1-159 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GAgACCCTCC AGAGGCAGGG CCCAGGATTG AAGAGGGAAG CCCTGCTCCA CACGTGTTCA 60 

TCAGGAAGGA CCCACAGACT GCTGCTCCTG GAGGCCTCTC G3TTTATGGA TGTGTGTTTG 120 

TTCCATAAAC CCTCAGAGGG TCACCTGGAG ACCCGCTAAA ATGCAGGTTC TTGGGCCACA 180 

TCCTAGACCT TCTGACCGAC CCAGGGAGTG GGGCCCAGGA AGCTGCATTT GACAGATATC 240 

CCCGTGTGAT CATCATGCAC ACAGGAGTGA GAGAACCAGT GTTCTCCCCG 3GCAGAAGGG 300 

AAGCTCGTGT GCAGGACACC TCACACCTCC TTTCCCATTC CCCTGCCAGG CTCTCCCTGC 360

TGACATTGTT TTTGCGGGAG AGCTGTGAAT TCTGAAGATT AGGTTGCTTC 1 
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CTCCAGAAGT CCAGGCTGAG CCAAACCAAG CTTCAAGTTG TGCCTGGACT TGGAGAACCA 
GGAGGTGAGG GGACTGACTA CTTGAAGATC ACATGGAGGA GGAGTCTGAT CCAGGCCCAG 
GCACCAAGGA AAGGCCATGC AAGGACACAG GGAGAAGGGC AGCTGTCTGT AAGCCAGAAA 
GAGCCTTCAC TAGAAACCAA ATCAGCCAGA ACCTTCATCT TGGACTTTCC AGCCTTCAGA 
GATGTGAAAA AATAAATTTC TGTTGATTAA CCTAAAAAA 



Seq ID No: 163 Protein sequence: 
Protein Accession #: none found 



Seq ID NO: 164 DNA sequence 

Nucleic Acid Accession #: NM_020241.1 

Coding sequence: 4-1557 (underlined sequences correspond to start and stop codons) 



GCCATGCAGA
CTGGGGGGCG
GACTACCTGA 
GAAGGTGCTG 
GGGGACAGGG 
TACCAGAGGA 
GGCAAACAGG 
ACGCTCTTTG 
ACCCTGCAGC 
CACGCCAATG 
CTAGCCATTG 
AAACATGACT 
CATGTCTACT 
GTGTCCCGCG 
AAGCAGTGGA 
TTCTACTTCA 
GTCCTGGCCG 
GACCTGACAC 
TCCATCTGGA 
GCCCCCGGGA 
AAGACCCACC 
CGGACCCTGA 
GGCAACCAGA 
CGGCCCAATG 
AGGGTGTGTG 
CGACGCTGGG 
CCCCCCACTC 
AGGGCCTGCC 
CGGCGAAGGT 
GCCACCCGTC 



11 21 31 

I I I 

CCCCGCGAGC GTCCCCTCCC CGCCCGGCCC 
GAGCCGCCGC 
GGCAGCGGGC 
GTCCTGCGGG 
TTGGAGCCCC 



ACCACTATCC 
ACGACCTCAA 
ACAACCTCTA 
AGCTGACCTG 
AGGGCGAGTG 
TGTGCGGTTC 
CCGTCGGAGA 
TTGCCCTCTT 
ATGCTGTCAT 
CCAAGTGGTT 
TCTTCTTCCG 
TGGCCCGAGT 
CGTCCTTCCT 
ACGTGCTGCA 
TTTTTTCCAC 
AGGTGGCAGC 
CGCCGGTGCC 
TGCAGTACAA 
CTCTGATGGA 



CGTGTTTGTG 
CATCCAGCGA 
CCGCGTAGAG 
GAGATCTAAC 
TCGAAACTTC 
CAACGCCTTC 



CTCTGACGGG 
CTACCGCAGC 
CAAAGAGCCT 
GGAGATTGCG 
GTGCAAGAAC 
GAAGGCGCGG 
GGCTGTCACG 
GCCCAGCAAC 
TGTGTTTGAA 
GGAGGATCAG 
TGCCTCCAGC 



GCTGACTCGA 



CCGTTGTCTT 



AGGGACGTCT 
ACGATCGTGG 
GGCCCGGGGG 



ATGCTCTTCA 
CTCGGGGACA 
TACTTTGTCC 
ATGGAGTTTA 
GACGTGGGAG 
CTCAACTGCT 
GGCGTGGTCA 
AGCATCCCTG 
GGCCGCTTCC 
GTGCCTCGAC 
GCCTTGCCGG 
CCCTCGCTGG 
GTGGCTGTGG 
GAGGCGGGGA 
GGGCGTGTGT 
TGGCCCCAGC 
CCTCCGAGGT 



TCCTGCTTCT 
CGCTTAGCGT 
CCGGACGCCT 
TCAACAGGAC 
CCACGTCCAC 
TAAACGTGTG 
TGCTCCTTCG 
GCGCCAACTA 
GCTGCCCGTA 
CAGCTACTGT 
GGCCCACCCT 



ACTACCTGGA 
GCTCCCCCCG 
CTGTACCCGG 
GCCTCGGGGG 
GCTCGGCTGT 
GAGAGCAGAA 



51 

I 

GCTGCTGCTA 
GGCCCCCAGG 
GACCCCCGCA 
GCTGTTCATT 
GGAGCTGCGG 
TCGGATGAAG 
GGACGAGTCC 
CAGCATAGAC 
CGACCCCAAG 
TACCGACTTC 
GCGCACCGTG 
GTGGGGCAGC 



CGGTCCTCAA 



GGCCTGGGCG 1 



CGGAAGTCAC 
GGGTGGGGCC 
CCCTTGTGAC 



ATCGGCAGCA 
CCTCTGTAAA 
CTCCCCCCTC 



GCTGTCTAAA 
TACGGCCCCA 
TGACCTCCAG 



TTCAGGCAGG 
GGGCTTGGGG 
GGGTGGTGAG 
CTGACCATGC 



GAAGGTGGTG 780 
CGTGCTGGAG 840 
AGACTCCCAT 900 
CCGGCCCGTG 960
CTGCGCCTTT 1020
GTCCCCCGAG 1080
GTGCTQCGCA 1140 
CAACTTTGTC 1200 
CTGGATCCTG 1260 
CGGCCCCTGG 1320 
GTTCCTCGTC 1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 



PALLLLLLLL 
LRVNRTLFIG 
KVLLLRDEST 
LFTATVTDFL 
EFNYLEKVW 
WSLGGRPW 
PRPRPGCCAA 
AVDVGAGPWG 



ANVALFSDGM 
VYFFFREIAM 
YFNVLQAVTG 
IWTPVPEDQV 
TLMRHQLTRV 
VCVHERRSWW 

Seq ID NO: 
Nucleic Acid Accession S 
Coding sequence: 39-2705 



AIDAVIYRSL 
SRVARVCKND 
LAVFSTPSNS 
PGMQYNASSA 



31 
I 

3 PPPLSVAPRD 
EPPTSTELRY 
PVCANYSIDT 
GDRPTLRTVK 
VGGSPRVLEK 
IPGSAVCAFD 
LPDDILNFVK 
AGTVLKFLVR 



YLNHYPVFVG £ 
QRKLTWRSNP SDINVCRMKG 
LQPVGDNISG MARCPYDPKH 
HDSKWFKEPY FVHAVEWGSH 
QWTSFliKARL NCSVPGDSHF 
LTQVAAVFEG RFREQKSPES 
THPLMDEAVP SLGHAPWILR 
PNASTSGTSG RVCQVGHACR 



I I I I I I 

TCCGAGGCGT CACCTCCTCC TGTCGCCTGG CCCTCGCC AT G CAGACCCCG CGAGCGTCCC 
CTCCCCGCCC GGCCCTGCTG CTTCTGCTGC TGCTACTGGG GGGCGCCCAC GGCCTCTTTC 
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; GCCGCCGCTT AGCGTGGCCC CCAGGGACTA CCTGAACCAC TATCCCGTGT 180 

TTGTGGGCAG CGGGCCCGGA CGCCTGACCC CCGCAGAAGG TGCTGACGAC CTCAACATCC 240 

AGCGAGTCCT GCGGGTCAAC AGGACGCTGT TCATTGGGGA CAGGGACAAC CTCTACCGCG 300 

TAGAGCTGGA GCCCCCCACG TCCACGGAGC TGCGGTACCA GAGGAAGCTG ACCTGGAGAT 360

CTAACCCCAG CGACATAAAC GTGTGTCGGA TGAAGGGCAA ACAGGAGGGC GAGTGTCGAA 420 

ACTTCGTAAA GGTGCTGCTC CTTCGGGACG AGTCCACGCT CTTTGTGTGC GGTTCCAACG 480

CCTTCAACCC GGTGTGCGCC AACTACAGCA TAGACACCCT GCAGCCCGTC GGAGACAACA 540 

TCAGCGGTAT GGCCCGCTGC CCGTACGACC CCAAGCACGC CAATGTTGCC CTCTTCTCTG 600 

ACGGGATGCT CTTCACAGCT ACTGTTACCG ACTTCCTAGC CATTGATGCT GTCATCTACC 660

GCAGCCTCGG GGACAGGCCC ACCCTGCGCA CCGTGAAACA TGACTCCAAG TGGTTCAAAG 720 

AGCCTTACTT TGTCCATGCG GTGGAGTGGG GCAGCCATGT CTACTTCTTC TTCCGGGAGA 780 

TTGCGATGGA GTTTAACTAC CTGGAGAAGG TGGTGGTGTC CCGCGTGGCC CGAGTGTGCA 840 

AGAACGACGT GGGAGGCTCC CCCCGCGTGC TGGAGAAGCA GTGGACGTCC TTCCTGAAGG 900 

CGCGGCTCAA CTGCTCTGTA CCCGGAGACT CCCATTTCTA CTTCAACGTG CTGCAGGCTG 960

TCACGGGCGT GGTCAGCCTC GGGGGCCGGC CCGTGGTCCT GGCCGTTTTT TCCACGCCCA 1020 

GCAACAGCAT CCCTGGCTCG GCTGTCTGCG CCTTTGACCT GACACAGGTG GCAGCTGTGT 1080 

TTGAAGGCCG CTTCCGAGAG CAGAAGTCCC CCGAGTCCAT CTGGACGCCG GTGCCGGAGG 1140 

ATCAGGTGCC TCGACCCCGG CCCGGGTGCT GCGCAGCCCC CGGGATGCAG TACAATGCCT 1200 

CCAGCGCCTT GCCGGATGAC ATCCTCAACT TTGTCAAGAC CCACCCTCTG ATGGACGAGG 1260

CGGTGCCCTC GCTGGGCCAT GCGCCCTGGA TCCTGCGGAC CCTGATGAGG CACCAGCTGA 1320 

CTCGAGTGGC TGTGGACGTG GGAGCCGGCC CCTGGGGCAA CCAGACCGTT GTCTTCCTGG 1380 

GTTCTGAGGC GGGGACGGTC CTCAAGTTCC TCGTCCGGCC CAATGCCAGC ACCTCAGGGA 1440 

CGTCTGGGCT CAGTGTCTTC CTGGAGGAGT TTGAGACCTA CCGGCCGGAC AGGTGTGGAC 1500 

GGCCCGGCGG TGGCGAGACA GGGCAGCGGC TGCTGAGCTT GGAGCTGGAC GCAGCTTCGG 1560

GGGGCCTGCT GGCTGCCTTC CCCCGCTGCG TGGTCCGAGT GCCTGTGGCT CGCTGCCAGC 1620 

AGTACTCGGG GTGTATGAAG AACTGTATCG GCAGTCAGGA CCCCTACTGC GGGTGGGCCC 1680 

CCGACGGCTC CTGCATCTTC CTCAGCCCGG GCACCAGAGC CGCCTTTGAG CAGGACGTGT 1740 

CCGGGGCCAG CACCTCAGGC TTAGGGGACT GCACAGGACT CCTGCGGGCC AGCCTCTCCG 1800 

AGGACCGCGC GGGGCTGGTG TCGGTGAACC TGCTGGTAAC GTCGTCGGTG GCGGCCTTCG 1860

TGGTGGGAGC CGTGGTGTCC GGCTTCAGCG TGGGCTGGTT CGTGGGCCTC CGTGAGCGGC 1920 

GGGAGCTGGG CCGGCGCAAG GACAAGGAGG CCATCCTGGC GCACGGGGCG GGCGAGGCGG 1980 

TGCTGAGCGT CAGCCGCCTG GGCGAGCGCA GGGCGCAGGG TCCCGGGGGC CGGGGCGGAG 2040

GCGGTGGCGG TGGCGCCGGG GTTCCCCCGG AGGCCCTGCT GGCGCCCCTG ATGCAGAACG 2100 

GCTGGGCCAA GGCCACGCTG CTGCAGGGCG GGCCCCACGA CCTGGACTCG GGGCTGCTGC 2160

CCACGCCCGA GCAGACGCCG CTGCCGCAGA AGCGCCTGCC CACTCCGCAC CCGCACCCCC 2220 

ACGCCCTGGG CCCCCGCGCC TGGGACCACG GCCACCCCCT GCTCCCGGCC TCCGCTTCAT 2280 

CCTCCCTCCT GCTGCTGGCG CCCGCCCGGG CCCCCGAGCA GCCCCCCGCG CCTGGGGAGC 2340 

CGACCCCCGA CGGCCGCCTC TATGCTGCCC GGCCCGGCCG CGCCTCCCAC GGCGACTTCC 2400 

CGCTCACCCC CCACGCCAGC CCGGACCGCC GGCGGGTGGT GTCCGCGCCC ACGGGCCCCT 2460 

TGGACCCAGC CTCAGCCGCC GATGGCCTCC CGCGGCCCTG GAGCCCGCCC CCGACGGGCA 2520 

GCCTGAGGAG GCCACTGGGC CCCCACGCCC CICCGGCCGC CACCCTGCGC CGCACCCACA 2580 
CGTTCAACAG CGGCGAGGCC CGGCCTGGGG ACCGCCACCG CGGCTGCCAC G 
GCACAGACTT GGCCCACCTC CTCCCCTATG GGGGGGCGGA CAGGACTGCG C 

CCTAGGCCGG GGGCCCCCCG ATGCCTTGGC AGTGCCAGCC ACGGGAACCA GGAGCGAGAG 2760

ACGGl'GCCAG AACGCCGGGG CCCGGGGCAA CTCCGAGTGG GTGCTCAAGT CCCCCCCGCG 2820

ACCCACCCGC GGAGTGGGGG GCCCCCTCCG CCACAAGGAA GCACAACCAG CTCGCCCTCC 2880

CCCTACCCGG GGCCGCAGGA CGCTGAGACG GTTTGGGGGT GGGTGGGCGG GAGGACTTTG 2940

CTATGGATTT GAGGTTGACC TTATGCGCGT AGGTTTTGGT TTTTTTTGCA GTTTTGGTTT 3000

CTTTTGCGGT TTTCTAACCA ATTGCACAAC TCCGTTCTCG GGGTGGCGGC AGGCAGGGGA 3060

GGCTTGGACG CCGGTGGGGA ATGGGGGGCC ACAGCTGCAG ACCTAAGCCC TCCCCCACCC 3120 

CTGGAAAGGT CCCTCCCCAA CCCAGGCCCC TGGCGTGTGT GGGTGTGCGT GCGTGTGCGT 3180 

GCCGTGTTCG TGTGCAAGGG GCCGGGGAGG TGGGCGTGTG TGTGCGTGCC AGCGAAGGCT 3240 

GCTGTGGGCG TGTGTGTCAA GTGGGCCACG CGTGCAGGGT GTGTGTCCAC GAGCGACGAT 3300 

CGTGGTGGCC CCAGCGGCCT GGGCGTTGGC TGAGCCGACG CTGGGGCTTC CAGAAGGCCC 3360

GGGGGTCTCC GAGGTGCCGG TTAGGAGTTT GAACCCCCCC CACTCTGCAG AGGGAAGCGG 3420 

GGACAATGCC GGGGTTTCAG GCAGGAGACA CGAGGAGGGC CTGCCCGGAA GTCACATCGG 3480 
CAGCAGCTGT CTAAAGGGCT TGGGGGCCTG GGGGGCGGCG AAAG 



1 11 21 31 41 51 

I I I I I I

MQTPRASPPR PALLLLLLLL GGAHGLFPED PPPLSVAPRD YLNHYPVFVG SGPGRLTPAE 
GADDLNIQRV LRVNRTLFIG DRDNLYRVEL EPPTSTELRY QRKLTWRSNP SDINVCRMKG
KQEGECRNFV KVLLLRDEST LFVCGSNAFN PVCANYSIDT LQPVGDNISG MARCPYDPKH
ANVALFSDGM LFTATVTDFL AIDAVIYRSL GDRPTLRTVK HDSKWFKEPY FVHAVEWGSH
VYFFFREIAM EFNYLEKVVV SRVARVCKND VGGSPRVLEK QWTSFLKARL NCSVPGDSHF
YFNVLQAVTG VVSLGGRPVV LAVFSTPSNS IPGSAVCAFD LTQVAAVFEG RFREQKSPES
IWTPVPEDQV PRPRPGCCAA PGMQYNASSA LPDDILNFVK THPLMDEAVP SLGHAPWILR 
TLMRHQLTRV AVDVGAGPWG NQTVVFLGSE AGTVLKFLVR PNASTSGTSG LSVFLEEFET
YRPDRCGRPG GGETGQRLLS LELDAASGGL LAAFPRCVVR VPVARCQQYS GCMKNCIGSQ
DPYCGWAPDG SCIFLSPGTR AAFEQDVSGA STSGLGDCTG LLRASLSEDR AGLVSVNLLV
TSSVAAFVVG AVVSGFSVGW FVGLRERREL ARRKDKEAIL AHGAGEAVLS VSRLGERRAQ
GPGGRGGGGG GGAGVPPEAL LAPLMQNGWA KATLLQGGPH DLDSGLLPTP EQTPLPQKRL
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PTPHPHPHAL GPRAWDHGHP LLPASASSSL LLLAPARAPE QPPAPGEPTP DGRLYAARPG 780 
RASHGDFPLT PHASPDRRRV VSAPTGPLDP ASAADGLPRP WSPPPTGSLR RPLGPHAPPA 840 
ATLRRTHTFN SGEARPGDRH RGCHARPGTD LAHLLPYGGA DRTAPPVP



Seq ID NO: 168 DNA sequence
Nucleic Acid Accession #: AW205664 

Coding sequence: 1-135 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I
CGGCACGAGG AGAACAGGGG CCTCTGCCTC AGTTTGCCCG GGAGCCAGCC AGGGCCCATC 
CTAATTTGGA GCACAGTCTT CCCGGTGCCT AGACATGCCA AGGCCCCTCC CACGTGGTAC 
ACCCTCTCCG TT TAG TACCT GACCACCTGT TTCAAAACGC AGGTGTTTCT GGTTTAGAAA 
CTTGGAAGGC GGAATGTGTT TTCGTGTCTT CTAGGAAGGG TCTGCTGAGG ACCAGACCAC 
GTAAGCCTGA GTGGATCCTG ACTCAGCTGC AGCCCTTACC TGCCTCGTGC TGATGATCTA 
TGCATGGCGT TATGTAGATC ACGTGCGGCA GAGACAGCCA CTGTCCTGTG TGCGGGTTTT 
TAAAACAGCT GCCCTGGATG AAACGGAATA AACCAGTGAT GCTAAAAAAA AAAAAAAAAA 



1 11 21 31 41 51

I I I I I I 

RHEENRGLCL SLPGSQPGPI LIWSTVFPVP RHAKAPPTWY TLSV 



Seq ID NO: 170 DNA sequence 
Nucleic Acid Accession #: AB033100 

Coding sequence: 32-2623 (underlined sequences correspond to start and stop codons) 



AGGTCTGGGG TCCTGAGGCT GCTGGCAGAC 
GACGGTCTCG GCAGGCACCC CATTTGAGGG 
GCACTCCGTC AGCATCCACT CCTTCCAGAG 
CATCATCCCC AACAAGGTGG CCCCTGTTGT 
GATCCATGAT GAGCTGCTCA AGGCTCATTA 
TGAGCACTAC CTGGTGCAAG GAGCTCAGGC 
GGATGTCACT GAGAAGATGG ATGTGCTGGG 
CCGGCAGGTG CAGGGTGGGC TCACTGTGTT 
CAGGCGGGTC CTCCAGAAAC TCCAGAAGGA 
GCGGGAGGAA MCTGTGCTTT TCCTGCGTGC 
AG AC AAG C AG AACCTTCATG AGAACCTCCA 
CCTGGAGCTG GCCATCCGGA AAGAGATCCA 
CCATGTGTAC CATAACACCG AGGACCTGTG 
TGAGGACGAC TTGCATGTGA CGGAGGAGGT 
CTACAGGTAC CACCGCCTGC CCCTGCCCGA 
CGCCTTTGTC AGTGTTCTCC GGGAGACCCC 
GCCTCCCCCA GCCCTCGTCT TCAGCTGCCA 
GGTCCTGGGC ACCCTCATCC TGCTTCACCG 
CCCCACGCAG GCCAAGCCCC TGCCTATGGA 
CATGGTGCCC CAGGGAAGGA GGATGGTGGA 
CGAGTTGCAT GACCTGAAAG AAGTGGTCTT 
ACCGGAGAGC CCAGCCCAGG GAAGCGGCAG 
GAGCCTGGAG CGATACTTCT ACCTGATCCT 
GCTGGCCTTT GCCCTCAGTT TCAGCCGCTG 
GCCCGTGACG CTGAGCTCAG CAGGCCCTGT 
CCTACGGGAG GACGATCTGG TCTCCCCGGA 
GGCCAACTTC CGGCGGGTGC CCCGCATGCC 
GGCCCTGGGG AGCATCCTGG CCTACCTGAC 
CTGGGTGAGC CTTCGGGAGG AGGCCGTGTT 
GTGGCCTGGG CCCCCTGTGG CTCCTGACCA 
CCATCTAAGC GAGCCTCCCC CAGGCAAGGA 
CCTTACCATG CAGGAGGTCT TCAGCCAGCA 
CCGCATCCCC ATGCCGGACT TCTGTGCCCC 
GGCCCTGCGG GCCGCCCTCT CCAAGGACCC 
CGGCCAGGGC CGTAC C AC AA CTGCGATGGT 
AGGCTTCCCC GAGGTGGGTG AGGAGGAGCT 
GGGTGAATTT CAGGTAGTAA TGAAGGTGGT 
GAAGGAGGTG GACGCAGCGC TGGACACTGT 
CCTGCGGGAG ATCATCATCT GCACCTACCG 
AATGCGGAGG CTGCAGCTGC GGAGCCTGCA 



31 41 51 

I I I 

TATGGGTACA ACGGCCAGCA CAGCCCAGCA 60

CCTACAGGGC AGTGGCACGA TGGACAGTCG 120 

CACTAGCTTG CATAACAGCA AGGCCAAGTC 180 

GATCACGTAC AACTGCAAGG AGGAGTTCCA 240 

CACGTTGGGC CGGCTCTCGG ACAACACCCC 300

CTTACCCCAG GGCCGCTACT TCCTGGTGCG 360

CACCGTGGGA AGCTGTGGGG CCCCCAACTT 420 

CGGCATGGGA CAGCCCAGCC TCTTAGGGTT 480

CGGACATAGG GAGTGTGTCA TCTTCTGTGT 540 

AGATGAGGAC TTTGTGTCCT ACACACCTCG 600

GGGCCTTGGA CCCGGGGTCC GGGTGGAGAG 660

CGACTTTGCC CAGCTGAGCG AGAACACATA 720 

GGGGGAGCCC CATGCTGTGG CCATCCATGG 780 

GTACAAGCGG CCCCTCTTCC TGCAGCCCAC 840 

GCAAGGGAGT CCCCTGGAGG CCCAGTTGGA 900 

CAGCCTGCTG CAGCTCCGTG ATGCCCACGG 960 

GATGGGCGTG GGCAGGACCA ACCTGGGCAT 1020 

CAGTGGGACC ACCTCCCAGC CAGAGGCTGC 1080

GCAGTTCCAG GTGATCCAGA GCTTTCTCCG 1140 

AGAGGTGGAC AGAGCCATCA CTGCCTGTGC 1200 

GGAAAACCAG AAGAAGTTAG AAGGTATCCG 1260

CCGACACAGC GTCTGGCAGA GGGCGCTGTG 1320 

GTTTAACTAC TACCTTCATG AGCAGTACCC 1380

GCTGTGTGCC CACCCTGAGC TGTACCGCCT 1440 

GGCTCCGAGG GACCTCATCG CCAGGGGCTC 1500 

CGCGCTCAGC ACTGTCAGAG AGATGGATGT 1560 

CATCTACGGC ACGGCCCAGC CCAGCGCCAA 1620 

GGACGCCAAG AGGAGGCTGC GGAAGGTTGT 1680

GGAGTGTGAC GGGCACACCT ACAGCCTGCG 1740 

GCTGGAGACC CTGGAGGCCC AGCTGAAGGC 1800 

GGGCCCCCTG ACCTACAGGT TCCAGACCTG 1860 

CCGCAGGGCC TGTCCTGGCC TCACCTACCA 1920 

CCGAGAGGAG GACTTTGACC AGCTGCTGGA 1980 

AGGCACIGGC TTCGTGTTCA GCTGCCTCAG 2040

GGTGGCTGTC CTGGCCTTCT GGCACATCCA 2100 

CGTGAGTGTG CCTGATGCCA AGTTCACTAA 2160 

GCAGCTGCTA CCCGATGGGC ACCGTGTGAA 2220 

CAGCGAGACC ATGACGCCCA TGCACTACCA 2280 

CCAGGCGAAG GCAGCGAAAG AGGCGCAGGA 2340 

GTACTTGGAG CGCTATGTCT GCCTGATTCT 2400 
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CTTCAACGCG TACCTCCACC TGGAGAAGGC CGACTCCTGG CAGAGGCCCT TCAGCACCTG 2460 

GATGCAGGAG GTGGCATCGA AGGCTGGCAT CTACGAGATC CTTAACGAGC TGGGCTTCCC 2520 

CGAGCTGGAG AGCGGGGAGG ACCAGCCCTT CTCCAGGCTG CGCTACCGGT GGCAGGAGCA 2580 

GAGCTGCAGC CTCGAGCCCT CTGCCCCCGA GGACTTGCTG TAGGGGGCCT TACTCCCTGT 2640 

CCCCCCACCC ACAGGGCCCC ACGCAGGCCT GGGGTGTCTG AGGTGCTCTT GGCTGGGAGC 2700 

GGCCCTGAGG GGTGCTGGCC TTGAAATGAT TCCCCCACTT CCTGGAGAGA CTGAGCGGAG 2760

TTGGGAGCCT TTTTAGAAAG AACTTTTTAT AGGACAGGGA GACAGCACAG CCATCCCTTG 2820

CAAACCACCA AGGTGTGTGG CTGACCTCCA GGGAGGAGCA CTCACTGGAG TGCTCACAAG 2880 

GTGCACACTG CTGTGTGTAC CTTGCAGACA GGCCGGCGTT CAGCCTCCAA GGGGCTCACT 2940 

CCCCCAGTTG CCAAACACTG TGGATCTCTC TGTCCTCTTC TCCCCTCTCT CAGATTGGCC 3000 

TGGCAGCCCC TGGCACAGAG CAGACCCGGC CACTGGTAGC TCCCCACTTC CTTACTCCTG 3060

CTGCTCTGCC ATTGCCGCTC CCCTTCTTGC TGCCCAAGCA CTGCCCTCGG GCGTCTGGCA 3120 

GCCTGAGGTG GGTGGAGGGG ACAGTGTTCT GGATAGATCT ATTATGTGAA AGGCAGCTTC 3180 

ACCCAGTTTT CTGGACTCTC ATGCCCCCAT CTCCGACCTG GGAGACTTCA GGAATGACAA 3240 

CCTACCCAGC CTGGTGGGGC TGGCAGGATG GTGGAGGTTT CTCAAGGAGC TGGAGACTTC 3300 

AGGGAGCCCC TCTCATGGGG AGGAAAGAGC TTCCAGGGGG CGAACGCAGC ACAGAGGAAG 3360

AGGCCTGCTC CACTTGTCTG GGAACCTGGG CAGGAGGCAC AGAGGAAGCC AAGGCCTGGA 3420 

GCTGCAGGTC CCCCGGCATC TCTCTCTGTC CCGGCAGCCC AGGATGGCCT GGTGCCCCCA 3480 

CCTGCTGCAG CAGGAGCCCC AAGGAGTGCT AGCTGAGGGT GGTTGCTGGG GTGGTCCTCA 3540 

TGGACAGTGA GGTGTGCAAG GGTGCACTGA GGGTGGTGGG AGGGGATCAC CTGGGTTCCA 3600 

GGCCATCCTT GCTGAGCATC TTTGAGCCTG CCTTCCGGTG GGAGCAGAAA AGGCCAGACC 3660

CTGCTGAGTT AGAGGCTGCT GGGATCCACT GTTTCCACAC AGCGGGAAGG CTGCTGGGAA 3720

CAGGTGGCAG AGAAGTGCCA TGTTTGCGTT GAGCCTTGCA GCTCTTCCAG CTGGGGACTG 3780 

GTGCTTGCTG AAACCCAGGA GCTGAACAGT GAGGAGGCTG TCCACCTTGC TTGGCTCACT 3840 

GGGACCAGSA AAGCCTGTCT TTGGTTAGGC TCGTGTACTT CTGCAGGAAA AAAAAAAAAG 3900 

GATGTGTCAT TGGTCATGAT ATTTGAAAAG GGGAGGAGGC CGAAGTTGTT CCCATTTATC 3960

CAGTATTGGA AAATATTTGA CCCCCTTGGC TGAATTCTTT TGCAGAACTA CTGTGTGTCT 4020 

GTTCACTACC TTTTCAGGTT TATTGTTTTT ATTTTTGCAT GAATTAAGAC GTTTTAATTT 4080 

CTTTGCAGAC AAGGTCTAGA TGCGGAGTCA GAGATGGGAC TGAATGGGGA GGGATCCTTT 4140 

GTGTTCTCAT GGTTGGCTCT GACTTTCAGC TGTGTTGGGA CCACTGGCTG ATCACATCAC 4200 

CTCTCTGCCT CAGTTTCCCC ATCTGTAAAA TGGGAGAATA ATACTTGCCT ACCTACCTCA 4260

CRGGGGTGTT GTGAGGATTC ATTTGTGATT TTTTTTTTTT TTTTTGTACA GAGCTTTTAA 4320 
GCATTAAAAA CAGCTAAATG TG 

Seq ID No: 171 Protein sequence: 
Protein Accession #: BAA86588.1



MGTTASTAQQ TVSAGTPFEG LQGSGTMDSR HSVSIHSFQS TSLHNSKAKS IIPNKVAPVV 60

ITYNCKEEFQ IHDELLKAHY TLGRLSDNTP EHYLVQGAQA LPQGRYFLVR DVTEKMDVLG 120 

TVGSCGAPNF RQVQGGLTVF GMGQPSLLGF RRVLQKLQKD GHRECVIFCV REEVLFLRAD 180

EDFVSYTPRD KQNLHENLQG LGPGVRVESL ELAIRKEIHD FAQLSENTYH VYHNTEDLWG 240 

EPHAVAIHGE DDLHVTEEVY KRPLFLQPTY RYHRLPLPEQ GSPLEAQLDA FVSVLRETPS 300

LLQLRDAHGP PPALVFSCQM GVGRTNLGMV LGTLILLHRS GTTSQPEAAP TQAKPLPMEQ 360

FQVIQSFLRM VPQGRRMVEE VDRAITACAE LHDLKEVVLE NQKKLEGIRP ESPAQGSGSR 420

HSVWQRALWS LERYFYLILF NYYLHEQYPL AFALSFSRWL CAHPELYRLP VTLSSAGPVA 480

PRDLIARGSL REDDLVSPDA LSTVREMDVA NFRRVPRMPI YGTAQPSAKA LGSILAYLTD 540 

AKRRLRKVVW VSLREEAVLE CDGHTYSLRW PGPPVAPDQD ETLEAQLKAH LSEPPPGKEG 600

PLTYRFQTCL TMQEVFSQHR RACPGLTYHR IPMPDFCAPR EEDFDQLLEA LRAALSKDPG 660

TGFVFSCLSG QGRTTTAMVV AVLAFWHIQG FPEVGEEELV SVPDAKFTKG EFQVVMKVVQ 720

LLPDGHRVKK EVDAALDTVS ETMTPMHYHL REIIICTYRQ AKAAKEAQEM RRLQLRSLQY 780 

LERYVCLILF NAYLHLEKAD SWQRPFSTWM QEVASKAGIY EILNELGFPE LESGEDQPFS 840
RLRYRWQEQS CSLEPSAPED LL 

Seq ID NO: 172 DNA sequence

Nucleic Acid Accession #: AK021806.1 

Coding sequence: 1-645 (underlined sequences correspond to start and stop codons)



1 11 21 

I I I 

ACTGTGCTTT TCCTGCGTGC AGATGAGGAC 
AACCTTCATG AGAACCTCCA GGGCCTTGGA 
GCCATCCGGA AAGAGATCCA CGACTTTGCC 
CATAACACCG AGGACCTGTG GGGGGAGCCC 
TTGCATGTGA CGGAGGAGGT GTACAAGCGG 
CACCGCCTGC CCCTGCCCGA GCAAGGGAGT 
AGTGTTCTCC GGGAGACCCC CAGCCTGCTG 
GCCCTCGTCT TCAGCTGCCA GATGGGCGTG 
ACCCTCATCC TGCTTCACCG CAGTGGGACC 
GCCAAGCCCC TGCCTATGGA GCAGTTCCAG 
CAGGGAAGGA GGATGGTGGA AGAGGTGGAT 
AGTTTTCTGG ACTCTCATGC CCCCATCTCC 
CCCAGCCTGG TGGGGCTGGC AGGATGGTGG 
AGCCCCTCTC ATGGGGAGGA AAGAGCTTCC 



TTTGTGTCCT ACACACCTCG AGACAAGCAG 60

CCCGGGGTCC GGGTGGAGAG CCTGGAGCTG 120 

CAGCTGAGCG AGAACACATA CCATGTGTAC 180 

CATGCTGTGG CCATCCATGG TGAGGACGAC 240 

CCCCTCTTCC TGCAGCCCAC CTACAGGTAC 300

CCCCTGGAGG CCCAGTTGGA CGCCTTTGTC 360

CAGCTCCGTG ATGCCCACGG GCCTCCCCCA 420 

GGCAGGACCA ACCTGGGCAT GGTCCTGGGC 480 

ACCTCCCAGC CAGAGGCTGC CCCCACGCAG 540 

GTGATCCAGA GCTTTCTCCG CATGGTGCCC 600 

AGATCTATTA TGTGAAAGGC AGCTTCACCC 660

GACCTGGGAG ACTTCAGGAA TGACAACCTA 720 

AGGTTTCTCA AGGAGCTGGA GACTTCAGGG 780 

AGGGGGCGAA CGCAGCACAG AGGAAGAGGC 840 






CAGGTCCCCC 
CTGCAGCAGG 
CAGTGAGGTG 
ATCCTTGCTG 
TGAGTTAGAG 
TGGCAGAGAA 
TTGCTGAAAC 
CCAGGAAAGC 
TGTCATTGGT 
ATTGGAAAAT 
ACTACCTTTT 
GCAGACAAGG 
TCTCATGGTT 
CTGCCTCAGT 
GGTGTTGTGA 
TAAAAACAGC 



TGTCTGGGAA 
GGCATCTCTC 
AGCCCCAAGG 
TGCAAGGGTG 
AGCATCTTTG 
GCTGCTGGGA 
GTGCCATGTT 
CCAGGAGCTG 



CATGATATTT 
ATTTGACCCC 
CAGGTTTATT 
TCTAGATGCG 
GGCTCTGACT 
TTCCCCATCT 
GGATTCATTT 
TAA^TGTG 



CCTGGGCAGG 
TCTGTCCCGG 
AGTGCTAGCT 
CACTGAGGGT 
AGCCTGCCTT 
TCCACTGTTT 
TGCGTTGAGC 
AACAGTGAGG 
TTAGGCTCGT 
GAAAAGGGGA 



AGGCACAGAG 
CAGCCCAGGA 
GAGGGTGGTT 
GGTGGGAGGG 
CCGGTGGGAG 
CCACACAGCG 
CTTGCAGCTC 
AGGCTGTCCA 
GTACTTCTGC 



GAAGCCAAGG 
TGGCCTGGTG 
GCTGGGGTGG 
GATCACCTGG 
CAGAAAAGGC 
GGAAGGCTGC 
TTCCAGCTGG 



GTTTTTATTT 
GAGTCAGAGA 
TTCAGCTGTG 
GTAAAATGGG 
GTGATTTTTT 



TTCTTTTGCA 
TTGCATGAAT 
TGGGACTGAA 
TTGGGACCAC 
AGAATAATAC 
TTTTTTTTTT 



AGGAAAAAAA 
GTTGTTCCCA 
3AACTACTGT 
TAAGACGTTT 



GTTCCAGGCC 
CAGACCCTGC 
TGGGAACAGG 
GGACTGGTGC 
CTCACTGGGA 
AAAAAGGATG 
TTTATCCAGT 
GTGTCTGTTC 
TAATTTCTTT 
TCCTTTGTGT 
CATCACCTCT 
ACCTCACGGG 
TTTTAAGCAT 



1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 



11 



21 



31 



41 



51 



I I I I I I 

TVLFLRADED FVSYTPRDKQ NLHENLQGLG PGVRVESLEL AIRKEIHDFA QLSENTYHVY 
HNTEDLWGEP HAVAIHGEDD LHVTEEVYKR PLFLQPTYRY HRLPLPEQGS PLEAQLDAFV 
SVLRETPSLL QLRDAHGPPP ALVFSCQMGV GRTNLGMVLG TLILLHRSGT TSQPEAAPTQ 
AKPLPMEQFQ VIQSFLRMVP QGRRMVEEVD RSIM



Seq ID NO:

Nucleic Acid Accession #: NM_016580
Coding sequence: 1212-4766 (underlined sequences correspond to start and stop codons)



GGGAAGCGGG 
AGACCATGGG 
TCCAGAGGTA
AACGTCACTG 
TTTAATCCTC 
AGATGAGGTA 
AGTTGCAAAG 
CAGCTGCTTT 
ACAAGATTGG 
GAGAGTTGGG 
ACGTGCAAGC 
CCCTCAGCAG 
CCGGCCTCAT 
TTGGCATTGA 
TGCTTCTCTG 
ACTGGGTCAG 
GTTCCCCAGT 
CTTCCAATGA 
GACAAGGTTG 
AACACTGGAG 
TGGCGGTAAG 
ACTTATTTCT 
CAGAGGAAGT 
AGAGGCGGAG 
TTCAGGTGGA 
TGTGCCGACA 
CTCTGATCCA 
AAGGCGAGCA 
ACAGAGCTCT 
GTGAGCACTT 
TAGTGGTGAA 
ATGACAATGG 
CCAATGACAA 
CTGCACCTGG 
GGGAGGTGGA 
TTGATGCCAA 
CCTACGAGGT 



AGGAGAGCCA 
CACCCTCATA 
GCCATAGGGT 
CCTGTGACTC 
ACAGTTTCCT 
ATGGAGGCCC 
TCAGAATTTG 
CCAGTGAGAC 
GAAAAAGACA 



I 

CACGGTCAAG 
AGTCAGTGTG 
GTGACAAGTT 
GGGGCCAGGC 
GCTGAAAGGG 
AGGAAAGTTA 
AACTCAGGCA 
AAAAACGGGT 



CCGCCTTGCT 
CCTCCTCAGC 
AACTCACAGC 
TTTCCCCAGC 
GTCCCTCTTA 
AGGGGGTCTC 



TGCCGGCAGC 
GACTTGTCCA 
CATGATGCAA 
TTTAGGGGAT 
GCCATCTGGT 
GCAAGCTGGG 
CTCTGAGGAA 
GTGGGATCCC 
TGTGGAGATC 
GGAGCTGGAA 
TGACCCAGAC 
TGCCTTGGAT 
GGAGCTGGAC 
GAACCCCCCC 
TAGCCCTGCG 
TACGCTTCTC 
GTTCTTCCTC 
GACAGGCCAG 
GGATGTTCAG 



CCTCCCTTTT 
CTAGCAACTG 
GACCAGCTCT 
CCCTGGCAAT 
GCCTTGGGGT 



I 

TTGCACAGGT 
GGCAGGGACT 
GTGCAGATTA 
CCAGGCCAAA 
CTACTATTCT 
AGTGACTTGT 
GTTTACCTCT 
GATCAGGGCA 
GGGAACAATG 
ACATGACTTG 
TGCTCACTTG 
CAGCTTCCCA 
TTAGCTTCAC 
CCTGTAGGTG 
TTTGGCAGTC 
TGTCCATCAT 



CCTGCTTGTT 
GGAGCACGGG 
AACTCTAGTG 
TTCTGCTGGG 
TGACCACTCT 
GGAAGCTGTC 
AGGTGTTGCA 



TGCCTGGTTT 
CAAGTGCTGG
ATCTCTGAGA 
ACAGGCCCTA 
GTCATTGTGG 
AGGGAAATCC 
AAGTCAGGTA 
TTTGCTGAGA 
ATAAAACTGA 



ACATCAATGA 
GCGCCTCTCT 
ACACCCTGCA 
GCCCTGATGA 
ATTCATTTTT 
CCAGCTTGGT 
GTTCACTGGC 
CCGCCACAGA 



TCTTGCAGCT 
GCCCCAGGGC 
CAACACTCAC 
GCCCTTCCTA 
TACTCCCATC 
CCCAGATGAC 
GATGGCTGCT 
GAGTCAAGAC 
GGGGAAAAGA 
CCCGGGAGGG 
GTGATGCAGA 
TCTGGCCTGT 
AGGAAATGCT 
GGGTTTCCAT 
AGAGTCCCAC 
TTGCTGAAGT 
CGTTTGGACA 
TCATAATCAT 
GCTGATCAAG 
GTTTCTGAAT 
GCTTTTGGGG 
CACGGTGAAA 
CCAGGAACTG 
GCTGCCTCAG 
GCGGCTGGAT 
GCTTGCCACA 
CCACCAGCCA 
GCGAACCCGG 
CACCTACACT 
GACCAAACAT 
TGATCTGGTG 
CAAGGTCAAC 
ACTGGAAATC 
CCCTGACCAA 
GGTGCTGGAC 
AGACTATGAA 
TCCTATCCCA 



TCTGGAATCA 



CCCTTGCAAT 
CATCATTTCG 
CCCACTCTAC 
ACCGCTGGTA 



AGAGAGGTAA 
TAGGAACAAA 
GCATCAGTCC 
GGCTCCCTTT 
CCCCGGAGCC 
CTTTCTCTAA 
AGGAAAAAGC 
ATCCTGCTCA 
GGACCAACTA 
TCTCAGATCG 
CTAACTATGG 
CCATCCAGGA 
CTAGCCCACT 
CCAGGTGGCT 
TACCAAGTGT 
GGCCGGGAGG 
GCGCTCCCCA 
CGAGAGCAGC 
GGGGATTTGG 
CGGTTTCCCA 
ATCCCCCTGG 
CTGTCTCCCA 
GCAGAACTCA 
TTAACTGCCT 
GTCTTGGACT 
CAAGAAGATG 
GGCCCCAATG 
ACCTTCAGTA 
AAGAACCCTG 
GCCCATTGCA 



1860 
1920 
1980 
2040 
2100 
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AAGTTCTCAT CAAGGTTCTG GATGTCAATG ACAACATCCC AAGCATCCAC GTCACATGGG 2280 

CCTCCCAGCC ATCACTGGTG TCAGAAGCTC TTCCCAAGGA CAGTTTTATT GCTCTTGTCA 2340 

TGGCAGATGA CTTGGATTCA GGACACAATG GTTTGGTCCA CTGCTGGCTG AGCCAAGAGC 2400

TGGGCCACTT CAGGCTGAAA AGAACTAATG GCAACACATA CATGTTGCTA ACCAATGCCA 2460

CACTGGACAG AGAGCAGTGG CCCAAATATA CCCTCACTCT GTTAGCCCAA GACCAAGGAC 2520

TCCAGCCCTT ATCAGCCAAG AAACAGCTCA GCATTCAGAT CAGTGACATC AACGACAATG 2580

CACCTGTGTT TGAGAAAAGC AGGTATGAAG TCTCCACGCG GGAAAACAAC TTACCCTCTC 2640 

TTCACCTCAT TACCATCAAG GCTCATGATG CAGACTTGGG CATTAATGGA AAAGTCTCAT 2700

ACCGCATCCA GGACTCCCCA GTTGCTCACT TAGTAGCTAT TGACTCCAAC ACAGGAGAGG 2760 

TCACTGCTCA GAGGTCACTG AACTATGAAG AGATGGCCGG CTTTGAGTTC CAGGTGATCG 2820

CAGAGGACAG CGGGCAACCC ATGCTTGCAT CCAGTGTCTC TGTGTGGGTC AGCCTCTTGG 2880

ATGCCAATGA TAATGCCCCA GAGGTGGTCC AGCCTGTGCT CAGCGATGGA AAAGCCAGCC 2940 

TCTCCGTGCT TGTGAATGCC TCCACAGGCC ACCTGCTGGT GCCCATCGAG ACTCCCAATG 3000 

GCTTGGGCCC AGCGGGCACT GACACACCTC CACTGGCCAC TCACAGCTCC CGGCCATTCC 3060

TTTTGACAAC CATTGTGGCA AGAGATGCAG ACTCGGGGGC AAATGGAGAG CCCCTCTACA 3120

GCATCCGCAG TGGAAATGAA GCCCACCTCT TCATCCTCAA CCCTCATACG GGGCAGCTGT 3180 

TCGTCAATGT CACCAATGCC AGCAGCCTCA TTGGGAGTGA GTGGGAGCTG GAGATAGTAG 3240 

TAGAGGACCA GGGAAGCCCC CCCTTACAGA CCCGAGCCCT GTTGAGGGTC ATGTTTGTCA 3300

CCAGTGTGGA CCACCTGAGG GACTCAGCCC GCAAGCCTGG GGCCTTGAGC ATGTCGATGC 3360

TGACGGTGAT CTGCCTGGCT GTACTGTTGG GCATCTTCGG GTTGATCCTG GCTTTGTTCA 3420

TGTCCATCTG CCGGACAGAA AAGAAGGACA ACAGGGCCTA CAACTGTCGG GAGGCCGAGT 3480

CCACCTACCG CCAGCAGCCC AAGAGGCCCC AGAAACACAT TCAGAAGGCA GACATCCACC 3540

TCGTGCCTGT GCTCAGGGGT CAGGCAGGTG AGCCTTGTGA AGTCGGGCAG TCCCACAAAG 3600 

ATGTGGACAA GGAGGCGATG ATGGAAGCAG GCTGGGACCC CTGCCTGCAG GCCCCCTTCC 3660 

ACCTCACCCC GACCCTGTAC AGGACGCTGC GTAATCAAGG CAACCAGGGA GCACCGGCGG 3720

AGAGCCGAGA GGTGCTGCAA GACACGGTCA ACCTCCTTTT CAACCATCCC AGGCAGAGGA 3780 

ATGCCTCCCG GGAGAACCTG AACCTTCCCG AGCCCCAGCC TGCCACAGGC CAGCCACGTT 3840 

CCAGGCCTCT GAAGGTTGCA GGCAGCCCCA CAGGGAGGCT GGCTGGAGAC CAGGGCAGTG 3900

AGGAAGCCCC ACAGAGGCCA CCAGCCTCCT CTGCAACCCT GAGACGGCAG CGACATCTCA 3960

ATGGCAAAGT GTCCCCTGAG AAAGAATCAG GGCCCCGTCA GATCCTGCGG AGCCTGGTCC 4020

GGCTGTCTGT GGCTGCCTTC GCCGAGCGGA ACCCCGTGGA GGAGCTCACT GTGGATTCTC 4080

CTCCTGTTCA GCAAATCTCC CAGCTGCTGT CCTTGCTGCA TCAGGGCCAA TTCCAGCCCA 4140 

AACCAAACCA CCGAGGAAAT AAGTACTTGG CCAAGCCAGG AGGCAGCAGG AGTGCAATCC 4200 

CAGACACAGA TGGCCCAAGT GCAAGGGCTG GAGGCCAGAC AGACCCAGAA CAGGAGGAAG 4260

GGCCTTTGGA TCCTGAAGAG GACCTCTCTG TGAAGCAACT GCTAGAAGAA GAGCTGTCAA 4320

GTCTGCTGGA CCCCAGCACA GGTCTGGCCC TGGACCGCCT GAGCGCCCCT GACCCGGCCT 4380

GGATGGCGAG ACTCTCTTTG CCCCTCACCA CCAACTACCG TGACAATGTG ATCTCCCCGG 4440 

ATGCTGCAGC CACGGAGGAG CCAAGGACCT TCCAGACGTT CGGCAAGGCA GAGGCACCAG 4500 

AGCTGAGCCC AACAGGCACG AGGCTGGCCA GCACCTTTGT CTCGGAGATG AGCTCACTGC 4560 

TGGAGATGCT GCTGGAACAG CGCTCCAGCA TGCCCGTGGA GGCCGCCTCC GAGGCGCTGC 4620

GGCGGCTCTC GGTCTGCGGG AGGACCCTCA GTTTAGACTT GGCCACCAGT GCAGCCTCAG 4680

GCATGAAAGT GCAAGGGGAC CCAGGTGGAA AGACGGGGAC TGAGGGCAAG AGCAGAGGCA 4740

GCAGCAGCAG CAGCAGGTGC CTGTGAACAT ACCTCAGACG CCTCTGGATC CAAGAACCAG 4800

GGGCCTGAGG ATCTGTGGAC AAGAGCTGGT TTCTAAAATC TTGTAACTCA CTAGCTAGCG 4860

GCGGCCTGAG AACTTTAGGG TGACTGATGC TACCCCCACA GAGGAGGCAA GAGCCCCAGG 4920

ACTAACAGCT GACTGACCAA AGCAGCCCCT TGTAAGCAGC TCTGAGTCTT TTGGAGGACA 4980

GGGACGGTTT GTGGCTGAGA TAAGTGTTTC CTGGCAAAAC ATATGTGGAG CACAAAGGGT 5040 

CAGTCCTCTG GCAGAACAGA TGCCACGGAG TATCACAGGC AGGAAAGGGT GGCCTTCTTG 5100 

GGTAGCAGGA GTCAGGGGGC TGTACCCTGG GGGTGCCAGG AAATGCTCTC TGACCTATCA 5160 

ATAAAGGAAA AGCAGTGATT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA

Seq ID No: 175 Protein sequence: 
Protein Accession #: NP_057664.1

1 11 21 31 41 51

| | | | | |

MMQLLQLLLG LLGPGGYLFL LGDCQEVTTL TVKYQVSEEV PSGTVIGKLS QELGREERRR 60 
QAGAAFQVLQ LPQALPIQVD SEEGLLSTGR RLDREQLCRQ WDPCLVSFDV LATGDLALIH 120

VEIQVLDIND HQPRFPKGEQ ELEISESASL RTRIPLDRAL DPDTGPNTLH TYTLSPSEHF 180
ALDVIVGPDE TKHAELIWK ELDREIHSFF DLVLTAYDNG NPPKSGTSLV KWVLDSNDN 240 
SPAFAESSLA LEIQEDAAPG TLLIKLTATD PDQGPNGEVE FFLSKHMPPE VLDTFSIDAK 300 
TGQVILRRPL DYEKNPAYEV DVQARDLGPN PIPAHCKVLI KVLDVNDNIP SIHVTWASQP 360 
SLVSEALPKD SFIALVMADD LDSGHNGLVH CWLSQELGHF RLKRTNGNTY MLLTNATLDR 420 

EQWPKYTLTL LAQDQGLQPL SAKKQLSIQI SDINDNAPVF EKSRYEVSTR ENNLPSLHLI 480
TIKAHDADLG INGKVSYRIQ DSPVAHLVAI DSNTGEVTAQ RSLNYEEMAG FEFQVIAEDS 540 
GQPMLASSVS WVSLLDAND NAPEWQPVL SDGKASLSVL VNASTC-HLLV PIETPNGLGP 600 
AGTDTPPLAT HSSRPFLLTT IVARDADSGA NGEPLYSIRS C-NEAHLFILM PHTGQLFVNV 660 
TNASSLIGSE WELEIWEDQ GSPPLQTRAL LRVMFVTSVD HLRDSARKPG ALSMSMLTVI 720 

CLAVLLGIFG LILALFMSIC RTEKKDNRAY NCREAESTYR QQPKRPQKHI QKADIHLVPV 780
LRGQAGEPCE VGQSHKDVDK EAMMEAGWDP CLQAPFHLTP TLYRTLRNQG NQGAPAESRE 840
VLQDTVNLLF NHPRQRNASR ENLNLPEPQP ATGQPRSRPL KVAGSPTGRL AGDQGSEEAP 900 
QRPPASSATL RRQRHLNGKV SPEKESGPRQ ILRSLVRLSV AAFAERNPVE ELTVDSPPVQ 960 

QISQLLSLLH QGQFQPKPNH RGNKYLAKPG GSRSAIPDTD GPSARAGGQT DPEQEEGPLD 1020

PEEDLSVKQL LEEELSSLLD PSTGLALDRL SAPDPAWMAR LSLPLTTNYR DNVISPDAAA 1080

TEEPRTFQTF GKAEAPELSP TGTRLASTFV SEMSSLLEML LEQRSSMPVE AASEALRRLS 1140 
VCGRTLSLDL ATSAASGMKV QGDPGGKTGT EGKSRGSSSS SRCL



1 11 21 31 41 51 

| | | | | |

GAGTCTCTTT GGGCCAGCCG GGCTGCTGCA GACAGACAGG AAGCACGCCT GACGCTCCTC 
TACCCTCGGG CAGCACAGCG GGGCTGGGAC TCACTCTAGC TTGCCCAGCA ACTTGCTTTC 
CTGTGTGAAC TCTGGCAGGC TGCCCTCTCT GTGCAAAGCT GCCACTGGGG CCTGCTCAGG 
GTGGCCTGGA ACTTGGAGGT GGGCAGTCAG GGCCTAGGAT GGGCCTGTGT CACCAGGGCA 
TGTGCCCTTG GGCCAGTTAC TTCCTCTCAG AGCCTTGGGC TCCTCCTCTG AGGATGGGGC 
TTGTTGGTGT GAAATGAGGT GAGCATGTTG AGTTGGGGAG CAGCAGGACA CGCACCTGCA 
GGCAGCCGCC CTGGCCACGC TCCCTCCCTA CCTTCCGAGT CGTGGGACAG ACACAGTAGA 
GCACAGCGGG CCAGCCTGCT CTCTTCTCTG TCTACTTTTT GCAGAAGAGT CAACAGATAC 
AACAGGCCCA GGGAGGTGCC CCTGGGGGCC CCAGTCCCCA TCACTCCAAG GGGCAGTCCT 
GCAAGTGACA AGGTGGGCCC AATCCCTGTG GAACAGGTCT CTGAGGACCA CAGAGTGGGG 
CCCCAGGGAA AGCTGGGAGC CGAGCTAGAG GCAGGCAGCA AGTAAGGGCA AAGCTGTGCC 
CCTGCCCGGA AGACCTTCCT GCCCCCAGAA CCCGACCCTC CGCAGATAGC CCTCCCTGGG 
CAGCAGCCCC CCAGCTTCCA AGGCCCGTGC CTCACCAGAC GCCATGCTCT CACGGACTTG 
TTTGCTGCTC TGTACCCTGC AGATCTGCCC CAGAGGAGCA GGTGAAAAGC CGCGCCTGCC 
GAGGTGCTGT GGCGGTGGAG TTTTGGGCAG AGGAGTGGGG GGAAGAGTTT CTCACTTTTA 
AGATTCTCCA AATCCAAGAT GAAGTCATGC TGTGCTTTGG AATGGTAGAT GCTCATTTAT 
GTAAAATCAT AATAAATGTT ACACAAACTG TTAAAAAAAA AAAAAAAAAA AAAAAA 



Nucleic Acid Accession #: none found 

Coding sequence: 3-107 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

AATGGAGCAC TCCAAAGAAC GATTTGACCA ATAGCATTTC TTCTCTGGGG GTTGTATTTC 60

AAAGCATGCA ACTCTCCAGG GAACCAGAAC TAAATTGCTT AAAATGAAGT CATTCCTCAG 120

ATTAACTTCC TCAGATAAAG TGTCAGCGGT CTGCAGAAAC GAAGAAGACA AAACTGAGAT 180 

TATCACTCAT AATTCTCTTA CTTACTATGT CAGTGAAACA ATGAGTTTGC ATTTTTGCAA 240 

TCCTAGAACA TTCTTCATTA GCCCTGGGTC ATGACCTCTT CCAGTTAATT CTCTTTCACA 300

CCTTTAGGAA AGATTTAAGA TGAACCTTCA ATAGGATATT AACATAACTC ATAGCCAATA 360 

CCACAGCTGC CTTTCAAATT AATGAGGTTA ATTGTTCTCC AGCAAACATG AGTTTGTCTT 420 

TGGCATTTTA AATGCTTCCC ATTGATCTGA CATTTTGCTG TTTCAAGTTT TAAAGGGCTC 480 

AAATCAAAGA CTATTGATAA CTGAGCAAAG AGCGAAGATC CAGAAATACG AAAACATTGT 540

CTTTTTTTTT CCATGAAAAA CAATCATAGC CTTTTGAATT CAATCGAAGT TTCTACATTA 600 

GCCATCTAAG ACTTATTTAA TTATTTCTGT TCTCAGTCAA GCTAATTCAA GTGAATGAAC 660 

AGTATTGACT TTTAAAATCT TTTTTAAATT TTTTTAAATC TTTAGTTTAT TAAGTTTGTA 720

GAAAAGCTCT GGGGCCATGA CCACTTACGT AAATGTTTCA GTTTAAAAAC AAAAGATTCA 780

GGCCTCTAAT TTGAGCCAAA TCCAGGTGAT CTTGTTTGAA ATTTTTGATG AATTTGAAAA 840

GATGAAAGTG GAACTTTTAA CATTCATGTT CCCCAAATTT TTCACTGGGA AGGGATGCTA 900 

ATTGCCTACT TAAGATATAA GTTCAAGAAT AACATTTTCA TAGAAAATTC AGAAAACTGC 960 

TTGACACAGC AGTGACATAG TTAGATGTGG CTCAGATGCC TTCCAAACCT GAGGGTCCCC 1020

AAAGATTTCT TTACCAGTTG TTTTTAACTA TGAATCTTAA TCTTGTTCAT TCCCCTGCCA 1080 
AAACAAATTT AAAAG 



Seq ID NO: 180 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 2-176 (underlined sequences correspond to start and stop codons) 
CCGGGTGGGG
CAGAAGGGCT 
TGTCCGTCCT 
TGGGGGCTAC 
CAACTTGGTG 
CGTGGTGGCC 



CGATGTCTAC 
AACCCCGCCG 
TAATAGTTCC 
ACCTTGACAT 
TAAATCACAA 



CCTCGGGATG 
TTCGAGGAGG 
CATAGTCACC 
TACCCAGGGA 
GAGAATAGAA 
GCCTTCTGCT 
CTCACCACGG 
CAGACAGAGG 
ACCCCACGCG 
TGGGGGTCAC 
TTCGATGTGC 
CAGTCTCAGA 
GAGTTATCCC 



CAGGCGCCGG TGCCCGGGCC CCTGGGCCT3 
TGTGGGGTCT 
AGGACGGAGA 
TTCTTAGGAA 
ATCGTGTTTA 



GCGCCATCGT 



TGAGCAGGAG 
GTTTTCCCTG 
CCCTGCAGCT 
TGTATTTCAC 
AAACAACCGC 



CGCTCTGGTT 
CTGCATCAGC 
CTTTGGATCT 
GGTGGCAGCG 
GGACGGCGTA 
GTTTTACTCC 
CACTGAGATT 



TTTI 
AGTGGGGTGG 



GGTGAACCGT 
TCTGGAGTCA 
ACCACCCCGC 



TGGTCTGAAT 
TGATGCCCCC 
GAGTTCTGGA 
AATCCCACCA 



CTGGACCCCG 
CTGCTGCTGG 
ATGTGACCGT 
TTATTGGCAT 
TCAGTTTTGG 
AGCACATTGA 
GGTACTTGTA 
TTGCTCAGCT 
ATTTGGATTC 
TGTGTAAGGG 
CTTGCTTCAT 



Nucleic Acid Accession #: AK001579.1 

Coding sequence: 1150-2637 (underlined sequences correspond to start and stop codons)



I 1 
TTTTCTCTGC TTTTCGCTAC 
CCCCCATCCC CCTTTCTCCT 
GCCACCAGCT GCTGGGCCCC 
CCCATACAGC CCCGGCCCCT 
ACCTCTTCCT GTGCTCAGCG 
TGCGGCGGCT ACAGGAGATC 
ATTTGGTCCT GGTGGAGACA 
TCACGGCATG GAACGCAGCC 
AGCAGCAGAT GAGCCGGGGT 
CCCAGCATGG GCTCCGGCTG 
TGAGACTCCT GGCTGAGTTC 
ACTTTGTGGA GGATGTCACT 
TGACCTCTGC ACGGTTGCTG 
AGAGCCAAGG CCCAACCAGG 
CTCCGCCCCA TCACTTCCCA 
GGAAGGGGGC AGAGACACAT 
AGCTGCCACT CTTTCTTCCC 
CCCCAGAAGA ATCAGCGCCT 
AACCGCCGCA CACTGGCCAC 
CTAAACCAGAJTGTGCACGCG 
GATGGGCGAG GGGAGCACGA 
GTCTTTGATA TCGATTCTGA 
ACCTGGAAGG ACGTGCAGCT 
CAGCAGCTCC CAGACAACTG 
CTGACTAACC AGGTACTGGA 
TTTGAGATTC GCGAGCATGG 
GAGCAGGCTT TACAATGGTG 
AAAGTCCCCC TGGCCCAAGC 
GTGGGGCTGT TGCGGTGTCG 
AGGTTCTTTC 
CCAGAACGGG 
AAGCCCCCAA CACCGTGGGG 
TGCACTGACG AGGATGAAAT 
GACCAGCAGC CAGTGGTCTT
GGCACTATGC CTTTGCTGCC 
AATCAGACCC TGCGGCGACT 
TCATCCCAGG GGTCTGTGGA 
GTGTATGAGG AAGTAGGGGC 
ACCACACGGG AGTGGACAGT 
CAACCCTTTC TCTCCAAGTC 
CCAGGCCCCC CTTCAAAGAG 
CAGGAGCTCA GCAGCCTCAT 
TCCCAGCCAT CCAGCCCCCA 
TTCCCCACCC AACCCCCATG 
TAGGACCAGC AGTCTGAGAG 
GTTGCAGCTT CCTCTGCCCT 
GAGCACTGGA CTAAAGGCTT 
GGGCCCAGCC CATTTATCTA 



CCCGGTCACT 
GTCCTCCCCC 
GGGCTGCTGG 
GGTCTCTGGC 



AGTGTGGTTT 
GGAAGGACCC 
ATTGGGGGCG 
GACATCCCCA 



GGCTGGGCCG 
TGTCAGGGTT 
GCCCCCCAGC 
CTGCAGCTGA 



CCTCGCTGGA 
ATCTCTGCCT 
CCCTGGGGGA 
CCATCCTGCA 



GGAGAAATAT 
CCTCATTGGG 
GAACTTGGCT 
GGTGCGAGTG 
CCAGGTAGCT 
GTCTCAGGCT 



GATGCGGGGG 
GGAGCTGGAG 
CCAGCTCCCA 
TGGCTGCCTC 
TGAGGAGCCA 
CCGCTGCCTG 
GGAAGGTGCC 
CTTCACATTG 
GTGGGATTGG 
ACGACGCCAT 
TATCCGTGGG 
ACACAACCGG 
GGAGCAAGAG 
CTTCCCTGAG 
GAAGCCAGAG 
AAGCACCCTT 
CAGTCCCCAG 
CCTGAGGAAA 
ATCCCCCAGC 
CACTTCCAGT 
GGTAGGTACC 
GGCTGGAAAG 
CAGTGGCTGC 



CGGCTGGTGG 
TCATCGTGGA 
ACCGGAAAGG 
CCCG3TCGGT 
AACGCTTCTT 
GGGAGGCTGC 
TCCCCCACCA 
TCATCCAGAG 
TTTGTGCCTA 
CAACCCTCCT 
AAAGATGTGA 
CATCTCTATC 
CTGCTGTTTG 
CTGCAAGAGC 
CAGATTGACT 
GGAGACCTCA 
AAGGTGTCCC 
ACAGCAGCTG 
CGGCCACTGC 
GAGCCCTGCT 
TTCACAGGTA 
CCTCGCTTGC 
CTGCTGCTCA 
AAGGTCTACC 
ATACTAGAGA 
ACCACCAGCA 
TCCTCCTCTG 
GATGACAGTG 



41 

TCCCCTATTC 
GTGGTTCTCC 
CCTATGGCTG 
TGGCCTCCTT 
CCCTGAGGAC 
CACCCCAGAT 
AGGAGAGGGC 
GGGCGGCACA 
TGCCTGCATC 
GGGCGCTCGT 
GAAGCTCCGA 
TCGTGAGCTC 
TGGTATTCCT 
GAATCCATGG 



CTTGTCTCTT 60 

CCGCTGAGCT 120

CGGTCCCCCT 180

CGTGGTGACC 240

ATGGTGCATC 300

AAGAAAGAGC 360

CGGCTGGACT 420

GGGCTGCAGG 480 

AGTTTTGTTA 540 



GAGCTGGAGG 
TTGATCCAGG 
AACCCCCTCA 



GCACGGGGGT 



CCCACTGGCC 
CCACCCTCCA 
AGAAGACCCA 
ACTCCAGAAT 
GTGTCCCAGG 
TAACTGAAGT 



AAAATCCCTC 
CCATTCCATC 
TTGGCTGCCT 
GGGTGCAGAA 
CACCCAGCGT 
TCATTGATGG 
TGGAGGTCAG 
TCATGGAAGT 
CAACCCTGAC 
GGATGGACTT 
ATCCCAAGGA 
CAGCTTCCCT 
TCCGACGTGA 
TGGGAAGCCG 
AGGAGAAGAA 
TGGGAATCCG 
AGATGCACCT 
TCCTTAAAGC 
ACCTTGCCCG 
GAGCCACCCT 
CCATGTTCTT 
AGCCTGTGTA 
ACACTTCTAC 
CCAGCCAGAA 
AGAGGCCACC 
CCCTAGAGGA 
CTGCAGGCCT 
TTCCAACACA 



CCAGGGGAGC 
GATGACCCTG 
AAGATCCCTG 
TTTGGCAGCC 
GGGGGAGGTG 
CCTCTGTACC 
CCCAGAGCTG 
GCCGCGGGTC 
ATGTGCGGCT 
GTTCCAGACG 
CTACATCTCT 
TCTTATCACC 
TTATATAGAG 
TGCTGAGGAG 



GAAACTCTTA 
CCAGTGTGGT 
ACAGGTCATG 



CTTCCAGGAG 
AAGCTCTAAA 
CAAGAAGTTA 
CTACTTGTCC 
CCAGCACGAT 
TCAGAAGTTT 
CCTCTCTGCC 
TCCAATGAAG 
CGAGGAGCCA 
CTCCTTCTCC 
GTCATTGGAT 
TGAGCCCCCT 
ACAGCTGCTC 
GGGAAGTCCT 
GACACCTGGC 
CACATGACCC 
TCGTGGCACT 
GCTGTGGAAG 
GCCCCTCTCT 
GTGAATGTCA 



1320 
1380 
1440 
1500 
1560
1620 



2280 
2340 
2400 
2460
2520 
2580 
2640 
2700 
2760 
2820 
2880 
AACTGTGTTT CTTAGAGCCA TAAGCCCCAC ATATTATCCC TGAACAAGGG CAGCTCCTGC 2940

TTTATATATT TGATACGTAG GGGTTCCATG AGAGATTTTG GGTTTTAAAG GAATGGTTTT 3000 

ACTGCATTAA AGAAAAAAAA TGCTTTGGAA ACCAGAGGCC TGGGTGATGT TAAAGTCTAT 3060

CCTGTCCCAC TTCCTACATT CTGGGACTAC CGTGAAGCCT GGAGTAGGGA GAGCGAGTTT 3120 
GGGAGCTGGG ACTCGGGGAG TCAAAAATAG ATGAGTAATT GTCAATAAAC CTGGGAACC



RLEKYKDVIG 
HEVRVLQELI 
NCVTLKVSPT 
WCQLPEPCSA 
RGRCLLLLKE 
EMWDWTTSIL 



SLLLKKVPLA 
KKSSKPEREW 
KAQHDDQQPV 



21 
I 

CCLAGGRLLV 
HIHPAFVPKN 
ATLIGHLYRV 
SDQVAQIDLE 
LEMRGTAAGM 
QAGCLFTGIR 
PLEGAKVYLG 
VLRRHSSSDL 
VEEQEELEEP 
KSSTLGQEER 
PQSPSPTGLP 



FLRSLRAKAQ PGSLPSPTRI HGLAALRPIT 
PSLCTSCHSF FP3PPQPSSI PSPELPQKNQ 
QKCAALNQMC TRKLALLFAP SVFQTDGRGE 
VSLITTWKDV QLSQAGDLIM EVYIEQQLPD 
DLWVTFEIRE HGELERPLHP KEKVLEQALQ 
RESPRVGLLR CREEPPRLLG SRFQERFFLL 
IRKKLKPPTP WGFTLILEKM HLYLSCTDED 
ARQKFGTMPL LPIRGDDSGA TLLSANQTLR 
VYEEPVYEEV GAFPELIQDT S 
PPEPPPGPPS KSSPQARt 
TQTPGFPTQP PCTSSPPSSQ PLT 



Seq ID NO: 184 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 1-81 (underlined sequences correspond to start and stop codons)



11 



21 



GTA GAGTTAG 
GTATTAAGCT 
AATCTCAACA 
CTTTCTCTAT 
AACCCTAACT 
CAOTGCACCA 
ACTGAAGATG 
AACATGTAAT 
TGTTCAGCTG 
AAGTTTTGTT 
ACAAAGGCAT 



I I 
TGTCAATGTG CTTAGAATAT 
TAAAAAGTTA ATTCAGTTTA
TAAGAAGTCA AAATGTAATG 
AATTTCATCA GTATGTCCTC 
CTGCTAATTA TAAGCTAGGC 
GCTTTGTTAT CTGTAAAATG 
AGAGAACATG ATATGTGTAA 
GAATGTAGTA ATAGTAATTA 
TAACAGAATA CCCAAAATAA 
ACGAATTCAG ACAATCCAGG 
CTTTCCTGAT TTCTGCCAGT 
TCCAGCCCTG GTTACGCCCA 
CACTTTAATC ATAGCTCCCA 
ATGCTTGCTT ACATGGTCAC 
CAATTAATAT CCAAGTGTCC 
GCATTCTTCT TAGCATAAGC 
CTCAGTTGTC CTTTCATTTA 
ATCAGCACCA TCACTACCAC 



TCCCTTTTCT 
AAGTAATCTT 
ATGATAATAC 
AGTGCCTTCC 
TTTTATTTTC 
CAGTTTTAAA 
GCTTTTATAG 
CTCAATGCAT 
TATTAGCACA 
CTAGATGCAC 



I 

TAAACATTTT
ACCAAATTAT
ACAATATCAA 
CCTATTTGTC 
GGACAAGTTA 
CAACACCTTC 
ACAATACCCA 



ACTTGGATGC 

CAGGATGGTT 
AGTAAGCTTT 
TGGCTTTTCA 
GCTTATTCTT 
CACCTCCTTG 
GCACTTAAAA 
TGCAGGTTAG 
CACCGTTCTA 
GACACAGTCA 
TTTTGCAGAA 



Seq ID No: 185 Protein sequence:



TGAAAATGTA 
TATTACTCAA 
CCTCTTACTT 
CTGCTTGTCC 
TAGAAAAAAA 
AAGTCATGGA 
TTCAACCCCA 
GCTCTTTCTG 



AAGAAAACCT 
TATCTCTGTA 
TATACTCCTA 



GCTGGGATCT 
TTGCCTAGAG 
AGAAAAGGCA 



AATTACTGAG 
CACATTCTTT 
GAAATGCTCC 
TGCCTTCTTC 
TTTTTGCTTC 
TGTCTCTCTC 
AAAACTTTGC 
TTGCTCTTCC 
AAAGAAGTTC 
AGCTCAGAGA 
AAATCCATGT 
GTGCTTGATT 
GCTCAGCATT 



CAGAGAGACA 
AAAAAAAGAA 
TTATGAAGTT 
TTGGACATCC 
AAAGCCACCA 
TACTTCCACA 
TTCTCTTCAA 
TGTGAGATCT 
TCACTCTACT 
AGTCTTCCCT 
GGCTGAGTTG 
CAGTCTGACT 
GCTCAATAAT 



CTCTAAAAAA 
TTTATATTTG 
AGGTATTTTT 
AAATTTTAGC 
TTTGACCTCT 
TTCTTGGGGT 
GAACATAGCA 
GTTGGGACTA 
TTTTGTTGTG 
ATCAGCAGGT 
CCAGAGTCCA 
GAGAAAGGGA 
TGCTGATACT 
TGTCTGGACA 
ACTAGCACCT 
GTCCTCAGTT 
TGAATCTGAC 
CGTTCTGTCC 



AACCCTCCGA 
ATGTGACCCG 
CCAGCCATCC 
TATGATATTT 
TCCAAGGTCA 
ATGAGTTCTG 
AGATTCCATG 
CCGCACATCC 



1380 
1440 
1500 
1560 
1620



Protein Accession #: none found 



Seq ID NO: 186 DNA sequence

Nucleic Acid Accession #: NM_002203.2 
Coding sequence: 43-3588 (underlined sequences correspond to start and stop codons) 



| | | | | |

CTGCAAACCC AGCGCAACTA CGGTCCCCCG GTCAGACCCA GGATGGGGCC AGAACGGACA 
GGGGCCGCGC CGCTGCCGCT GCTGCTGGTG TTAGCGCTCA GTCAAGGCAT TTTAAATTGT 
TGTTTGGCCT ACAATGTTGG TCTCCCAGAA GCAAAAATAT TTTCCGGTCC TTCAAGTGAA 180

CAGTTTGGGT ATGCAGTGCA GCAGTTTATA AATCCAAAAG GCAACTGGTT ACTGGTTGGT 240

TCACCCTGGA GTGGCTTTCC TGAGAACCGA ATGGGAGATG TGTATAAATG TCCTGTTGAC 300

CTATCCACTG CCACATGTGA AAAACTAAAT TTGCAAACTT CAACAAGCAT TCCAAATGTT 360 

ACTGAGATGA AAACCAACAT GAGCCTCGGC TTGATCCTCA CCAGGAACAT GGGAACTGGA 420

GGTTTTCTCA CATGTGGTCC TCTGTGGGCA CAGCAATGTG GGAATCAGTA TTACACAACG 480

GGTGTGTGTT CTGACATCAG TCCTGATTTT CAGCTCTCAG CCAGCTTCTC ACCTGCAACT 540 

CAGCCCTGCC CTTCCCTCAT AGATGTTGTG GTTGTGTGTG ATGAATCAAA TAGTATTTAT 600

CCTTGGGATG CAGTAAAGAA TTTTTTGGAA AAATTTGTAC AAGGCCTTGA TATAGGCCCC 660 

ACAAAGACAC AGGTGGGGTT AATTCAGTAT GCCAATAATC CAAGAGTTGT GTTTAACTTG 720

AACACATATA AAACCAAAGA AGAAATGATT GTAGCAACAT CCCAGACATC CCAATATGGT 780

GGGGACCTCA CAAACACATT CGGAGCAATT CAATATGCAA GAAAATATGC CTATTCAGCA 840

GCTTCTGGTG GGCGACGAAG TGCTACGAAA GTAATGGTAG TTGTAACTGA CGGTGAATCA 900

CATGATGGTT CAATGTTGAA AGCTGTGATT GATCAATGCA ACCATGACAA TATACTGAGG 960 

TTTGGCATAG CAGTTCTTGG GTACTTAAAC AGAAACGCCC TTGATACTAA AAATTTAATA 1020

AAAGAAATAA AAGCGATCGC TAGTATTCCA ACAGAAAGAT ACTTTTTCAA TGTGTCTGAT 1080

GAAGCAGCTC TACTAGAAAA GGCTGGGACA TTAGGAGAAC AAATTTTCAG CATTGAAGGT 1140

ACTGTTCAAG GAGGAGACAA CTTTCAGATG GAAATGTCAC AAGTGGGATT CAGTGCAGAT 1200

TACTCTTCTC AAAATGATAT TCTGATGCTG GGTGCAGTGG GAGCTTTTGG CTGGAGTGGG 1260

ACCATTGTCC AGAAGACATC TCATGGCCAT TTGATCTTTC CTAAACAAGC CTTTGACCAA 1320

ATTCTGCAGG ACAGAAATCA CAGTTCATAT TTAGGTTACT CTGTGGCTGC AATTTCTACT 1380

i CTCACTTTGT TGCTGGTGCT CCTCGGGCAA ATTATACCGG CCAGATAGTG 1440

3 TGAATGAGAA TGGCAATATC ACGGTTATTC AGGCTCACCG AGGTGACCAG 1500

ATTGGCTCCT ATTTTGGTAG TGTGCTGTGT TCAGTTGATG TGGATAAAGA CACCATTACA 1560

GACGTGCTCT TGGTAGGTGC ACCAATGTAC ATGAGTGACC TAAAGAAAGA GGAAGGAAGA 1620

GTCTACCTGT TTACTATCAA AAAGGGCATT TTGGGTCAGC ACCAATTTCT TGAAGGCCCC 1680

GAGGGCATTG AAAACACTCG ATTTGGTTCA GCAATTGCAG CTCTTTCAGA CATCAACATG 1740

GATGGCTTTA ATGATGTGAT TGTTGGTTCA CCACTAGAAA ATCAGAATTC TGGAGCTGTA 1800

TACATTTACA ATGGTCATCA GGGCACTATC CGCACAAAGT ATTCCCAGAA AATCTTGGGA 1860 

TCCGATGGAG CCTTTAGGAG CCATCTCCAG TACTTTGGGA GGTCCTTGGA TGGCTATGGA 1920

GATTTAAATG GGGATTCCAT CACCGATGTG TCTATTGGTG CCTTTGGACA AGTGGTTCAA 1980

CTCTGGTCAC AAAGTATTGC TGATGTAGCT ATAGAAGCTT CATTCACACC AGAAAAAATC 2040

ACTTTGGTCA ACAAGAATGC TCAGATAATT CTCAAACTCT GCTTCAGTGC AAAGTTCAGA 2100 

CCTACTAAGC AAAACAATCA AGTGGCCATT GTATATAACA TCACACTTGA TGCAGATGGA 2160 

TTTTCATCCA GAGTAACCTC CAGGGGGTTA TTTAAAGAAA ACAATGAAAG GTGCCTGCAG 2220 

AAGAATATGG TAGTAAATCA AGCACAGAGT TGCCCCGAGC ACATCATTTA TATACAGGAG 2280 

CCCTCTGATG TTGTCAACTC TTTGGATTTG CGTGTGGACA TCAGTCTGGA AAACCCTGGC 2340

ACTAGCCCTG CCCTTGAAGC CTATTCTGAG ACTGCCAAGG TCTTCAGTAT TCCTTTCCAC 2400 

AAAGACTGTG GTGAGGATGG ACTTTGCATT TCTGATCTAG TCCTAGATGT CCGACAAATA 2460 

CCAGCTGCTC AAGAACAACC CTTTATTGTC AGCAACCAAA ACAAAAGGTT AACATTTTCA 2520 

GTAACACTGA AAAATAAAAG GGAAAGTGCA TACAACACTG GAATTGTTGT TGATTTTTCA 2580 

GAAAACTTGT TTTTTGCATC ATTCTCCCTA CCGGTTGATG GGACAGAAGT AACATGCCAG 2640 

GTGGCTGCAT CTCAGAAGTC TGTTGCCTGC GATGTAGGCT ACCCTGCTTT AAAGAGAGAA 2700 

CAACAGGTGA CTTTTACTAT TAACTTTGAC TTCAATCTTC AAAACCTTCA GAATCAGGCG 2760

TCTCTCAGTT TCCAAGCCTT AAGTGAAAGC CAAGAAGAAA ACAAGGCTGA TAATTTGGTC 2820

i TTCCTCTCCT GTATGATGCT GAAATTCACT TAACAAGATC TACCAACATA 2880

3 AAATCTCTTC GGATGGGAAT GTTCCTTCAA TCGTGCACAG TTTTGAAGAT 2940

GTTGGTCCAA AATTCATCTT CTCCCTGAAG GTAACAACAG GAAGTGTTCC AGTAAGCATG 3000

GCAACTGTAA TCATCCACAT CCCTCAGTAT ACCAAAGAAA AGAACCCACT GATGTACCTA 3060

ACTGGGGTGC AAACAGACAA GGCTGGTGAC ATCAGTTGTA ATGCAGATAT CAATCCACTG 3120

AAAATAGGAC AAACATCTTC TTCTGTATCT TTCAAAAGTG AAAATTTCAG GCACACCAAA 3180

GAATTGAACT GCAGAACTGC TTCCTGTAGT AATGTTACCT GCTGGTTGAA AGACGTTCAC 3240

ATGAAAGGAG AATACTTTGT TAATGTGACT ACCAGAATTT GGAACGGGAC TTTCGCATCA 3300 

TCAACGTTCC AGACAGTACA GCTAACGGCA GCTGCAGAAA TCAACACCTA TAACCCTGAG 3360 

ATATATGTGA TTGAAGATAA CACTGTTACG ATTCCCCTGA TGATAATGAA ACCTGATGAG 3420 

AAAGCCGAAG TACCAACAGG AGTTATAATA GGAAGTATAA TTGCTGGAAT CCTTTTGCTG 3480 

TTAGCTCTGG TTGCAATTTT ATGGAAGCTC GGCTTCTTCA AAAGAAAATA TGAAAAGATG 3540

ACCAAAAATC CAGATGAGAT TGATGAGACC ACAGAGCTCA GTAGCTGAAC CAGCAGACCT 3600

ACCTGCAGTG GGAACCGGCA GCATCCCAGC CAGGGTTTGC TGTTTGCGTG CATGGATTTC 3660 

TTTTTAAATC CCATATTTTT TTTATCATGT CGTAGGTAAA CTAACCTGGT ATTTTAAGAG 3720

AAAACTGCAG GTCAGTTTGG ATGAAGAAAT TGTGGGGGGT GGGGGAGGTG CGGGGGGCAG 3780

GTAGGGAAAT AATAGGGAAA ATACCTATTT TATATGATGG GGGAAAAAAA GTAATCTTTA 3840

AACTGGCTGG CCCAGAGTTT ACATTCTAAT TTGCATTGTG TCAGAAACAT GAAATGCTTC 3900 

CAAGCATGAC AACTTTTAAA GAAAAATATG ATACTCTCAG ATTTTAAGGG GGAAAACTGT 3960 

TCTCTTTAAA ATATTTGTCT TTAAACAGCA ACTACAGAAG TGGAAGTGCT TGATATGTAA 4020

GTACTTCCAC TTGTGTATAT TTTAATGAAT ATTGATGTTA ACAAGAGGGG AAAACAAAAC 4080

ACAGGTTTTT TCAATTTATG CTGCTCATCC AAAGTTGCCA CAGATGATAC TTCCAAGTGA 4140

TAATTTTATT TATAAACTAG GTAAAATTTG TTGTTGGTTC CTTTTATACC ACGGCTGCCC 4200

CTTCCACACC CCATCTTGCT CTAATGATCA AAACATGCTT GAATAACTGA GCTTAGAGTA 4260 

TACCTCCTAT ATGTCCATTT AAGTTAGGAG AGGGGGCGAT ATAGAGACTA AGGCACAAAA 4320

TTTTGTTTAA AACTCAGAAT ATAACATTTA TGTAAAATCC CATCTGCTAG AAGCCCATCC 4380

TGTGCCAGAG GAAGGAAAAG GAGGAAATTT CCTTTCTCTT TTAGGAGGCA CAACAGTTCT 4440

CTTCTAGGAT TTGTTTGGCT GACTGGCAGT AACCTAGTGA ATTTTTGAAA GATGAGTAAT 4500

TTCTTTGGCA ACCTTCCTCC TCCCTTACTG AACCACTCTC CCACCTCCTG GTGGTACCAT 4560 

TATTATAGAA GCCCTCTACA GCCTGACTTT CTCTCCAGCG GTCCAAAGTT ATCCCCTCCT 4620

TTACCCCTCA TCCAAAGTTC CCACTCCTTC AGGACAGCTG CTGTGCATTA GATATTAGGG 4680
GGGAAAGTCA
TTCAGGGAGC 
CTAAATGTTG 
CCAGAAGTTA 
GCTGTCTTGT 
GTTCAAAAGG 
TTTTAACAAG 
ATTTCTACTT 
CTTAGATTAA 
AAACTTCAGA 
TTTAGTTTTA 
CAAGAATTTG 



CAGATGAGGC 



TTACACACTT 
TAGTGCTAAA 
GGATGTAAAC 
ACTGGAAACC 
CTTTTTCTTC 



TAGATCCTGA 
CACCCCAGTC 
TTTGCACCTT 



GTCCTCATTA 
AAAGTCTATG 
ACTTGGAAAA 



ACTAGGATGC 
ATTTTCTCTG 
CACTACATAT 
TAAAATGGGA 
ATCTGATCTG 



AATGTAAAGT 
ACCACCAAAT 
CACAAGAGTG 
TCAGATTGGG 
AGATGGACCA 
TTCCTGAGCC 
CTAAAGCTTT 
AGACTGAGCT 
GACTTCCTAT 



CTGTATATAA 
AAATAAGCTA 
AAAACACTCT 
TAGCAGGTGC 
AATTTGACCT 
ATAAGGCCCA 
CACTTTGAGA 
CCCACATTCT 
GACAAGTCCT 
GGAGTTCAGC 
AATACAAATA 



CAGGATTTCA 
ACCTTCTGTG 
AGGCAAGTTT 
GCAATCTGCA 
AACACCACCC 
CTAGGAGAAA 
TGACCTCTAT 
AGTGATGCTT 
CACAATCCTC 



MGPERTGAAP 
NWLLVGSPWS 
RNMGTGGFLT 
ESNSIYPWDA 
QTSQYGGDLT 
HDNILRFGIA 
IFSIEGTVQG 
KQAFDQILQD 
AHRGDQIGSY 
QFLEGPEGIE 
SQKILGSDGA 
FTPEKITLVN 
NERCLQKNMV 
FSIPFHKDCG 
IVVDFSENLF
NLQNQASLSF 
VHSFEDVGPK
ADINPLKIGQ 



I 



GFPENRMGDV 
CGPLWAQQCG 
VKNFLEKFVQ 
NTFGAIQYAR 
VLGYLNRNAL 
GDNFQMEMSQ 



21 

I 

QGILNCCLAY
YKCPVDLSTA 
NQYYTTGVCS 
GLDIGPTKTQ 
KYAYSAASGG 
DTKNLIKEIK 



31 

I 

" NVGLPEAKIF 
TCEKLNLQTS 
DISPDFQLSA 
VGLIQYANNP 
RRSATKVMVV
AIASIPTERY



FGSVLCSVDV 
NTRFGSAIAA 
FRSHLQYFGR 
KNAQIILKLC



I 

SGPSSEQFGY 
TSIPNVTEMK
SFSPATQPCP
RVVFNLNTYK
VTDGESHDGS 
FFNVSDEAAL 
AFGWSGTIVQ 
YTGQIVLYSV 
KKEEGRVYLF 
QNSGAVYIYN 
FGQVVQLWSQ
TLDADGFSSR 



SLIDVVVVCD



DSITDVSIGA 
NNQVAIVYNI 

VNQAQSCPEH IIYIQEPSDV VNSLDLRVDI
EDGLCISDLV LDVRQIPAAQ EQPFIVSNQN KRLTFSVTLK
PALKREQQVT
TRSTNINFYE 
NPLMYLTGVQ 
WLKDVHMKGE 



MLKAVIDQCN 
LEKAGTLGEQ 
KTSHGHLIFP 
NENGNITVIQ 
TIKKGILGQH 
GHQGTIRTKY 



FIFSLKVTTG 



AGILLLLALV 



TEVTCQVAAS 
KADNLVNLKI 
SVPVSMATVI 
NFRHTKELNC 
NTYNPEIYVI 
RKYEKMTKNP 



QKSVACDVGY 
PLLYDAEIHL 
IHIPQYTKEK 
RTASCSNVTC 
EDNTVTIPLM 
DEIDETTELS 



LEAYSETAKV 780 

NKRESAYNTG 840

FTINFDFNLQ 900 

ISSDGNVPSI 960 

TDKAGDISCN 1020

YFVNVTTRIW 1080 

PTGVIIGSII 1140



Seq ID NO: 188 DNA sequence 

Nucleic Acid Accession #: NM_002210.1 

Coding sequence: 42-3188 (underlined sequences correspond to start and stop codons)



51 



GGCTACCGCT 
GACGGCTGCG 
TGTGCCGCGC 
GTTACTTCGG 
TCGTGGGAGC 
TCAAATGTGA 
ATAGAGATTA 
CTGTGAGGTC 
AGATGAAACA 
TTGAGTATGC 
GAGGATTCAG 



CCCGGCTTGG 
CCTCGGTCCC 
CTTCAACCTA 
CTTCGCCGTG 
TCCCAAAGCA 
CTGGTCTTCT 
TGCCAAGGAT 
GAAACAGGAT 
GGAGCGAGAG 
TCCATGTAGA 
CATTGATTTT 



GCACTTCGGC 
CGCTTCTTCT 
GTCCTGCCGA 



CCAATGTTTA 
TTGATGACAG 
ATGACTTTGT 
ATGGGAAGAA 
GATTTTCTGT 
CACCTCTCTT 
TGTCTCTACA 
TTGCACGGTT 
ATATTGCAAT 
ATGGAAGATC 
CTCGAAGCAT 
ATGGATATCC 
CCAGACCAGT 
ACAATAAAAC 
TCTGCTTAAA 
TTCTTTTGGA 
GGTCCCCAAG 
AATTGATAGC 
TTTTTATGGA 



CAGCATCAAG 
CTATTTGGGT 
TTCAGGAGTT 
CATGTCCTCC 
AGCTGCCACT 
CATGGATCGT 
GAGAGCTTCA 
TGGCAGTGCC 
TGCTGCTCCA 
AACAGGCTTG 
GCCACCAAGC 
AGACTTAATT 
TATCACTGTA 
CTGCTCACTG 
GGCAGATGGC 
TAAACTCAAG 
TCACTCCAAG 
GTATCTGCGG 
ATATCGGTTG 



GATTTCTTCG 
AACACCACCC 
ACCCGCCGGT 
GATCCATTGG 
AAAATTTTGG 
CCTGTTGGAA 
TCACAAGATA 
ACTAAAGCTG 
ATTTCGGATC 
TATAATAACC 
TATTCTGTGG 
CCAAGAGCAG 
TTATACAATT 
GACATTAATG 
GGCTCTGATG 
GGAGACTTCC 
ATAGCTCCTT 
TATGGGGGTG 
AACGCAGTCC 
TTTGGCTATT 
GTAGGAGCTT 
AATGCTGGTC 
CCTGGAACAG 
AAAGGAGTAC 
CAAAAGGGAG 
AACATGACTA 
GATGAATCTG 
GATTATAGAA 



GCCAGCCAAT 
AATTTAAGTC 
CCTGTGCCCC 
CATGCTTTCT 
TTGATGCTGA 



AAGTGGCAGA 
AATTAGCAAC 
CTGTCGGAGA 



TTACTGGCGA 
GAGATGATTA 
GCAAACTCCA 
AGACGACAAA 



AAGATAAAAA 
CATCTCAAAT 
CAATGAAAGG 



GATSGCTTTT 
CTCGGGACTC 
GTACTCTGGC 
GTCTTCCCGG 
TGTGGAAGGA 
TGAATTTGAT 
CCATCAGTGG 
ATTGTACCAT 
TCAAGATGGA 
TGGACAGGGA 
TCTTGGTGGT 
AATCGTATCT 
TCGGACTGCA 
TTTCAATGGT 
GGGAATGGTT 
GCAGATGGCT 
TGCAGATGTG 
AGAGGTGGGG 
GCTGAATGGA 
GGACCAGGAT 



CTGCTACCTC 
CCCGAGGGAA 
ATGTTTCTTC 
GGGCAGGTCC 
GCAACAGGCA 



TGGAGAACTG 
ACAAAGACTG 
TTTTGTCAAG 



TTGAAGTGTA 
CTCTCAAAGT 
TTCCCAGGAA 
CAATTCGACG 
TTTCAAGGGG 
AATTTAGAGA 
CAGCTGCTGA 



CCTTGAAGGG 
AGCCACAGAT 
TCGAGCTATC 
CCCTAGCATT 
TTCCTGTTTT 
ACTTAATTTC 
AGCACTGTTT 
GGGACTGATG 
CAAACTCACT 
TACAACAGGC 



TTTATTGGAG 
CAGGTCTCAG 
TTTGAGGTCT 
GGTTTCAATG 
TATATCTTCA 
CAGTGGGCTG 
ATAGACAAAA 
TTATACAGGG 
TTAAATCAAG 
AATGTTAGGT 
CAGGTGGAAC 
CTCTACAGCA 
CAGTGTGAGG 
CCAATTACTA 
TTGCAACCCA 



1200 

1260

1320

1380 

1440 

1500 

1560 

1620 

1680 

1740 
TTCTTAACCA GTTCACGCCT GCTAACATTA GTCGACAGGC TCACATTCTA CTTGACTGTG 1920

GTGAAGACAA TGTCTGTAAA CCCAAGCTGG AAGTTTCTGT AGATAGTGAT CAAAAGAAGA 1980 

TCTATATTGG GGATGACAAC CCTCTGACAT TGATTGTTAA GGCTCAGAAT CAAGGAGAAG 2040

GTGCCTACGA AGCTGAGCTC ATCGTTTCCA TTCCACTGCA GGCTGATTTC ATCGGGGTTG 2100

TCCGAAACAA TGAAGCCTTA GCAAGACTTT CCTGTGCATT TAAGACAGAA AACCAAACTC 2160 

GCCAGGTGGT ATGTGACCTT GGAAACCCAA TGAAGGCTGG AACTCAACTC TTAGCTGGTC 2220

TTCGTTTCAG TGTGCACCAG CAGTCAGAGA TGGATACTTC TGTGAAATTT GACTTACAAA 2280

TCCAAAGCTC AAATCTATTT GACAAAGTAA GCCCAGTTGT ATCTCACAAA GTTGATCTTG 2340 

CTGTTTTAGC TGCAGTTGAG ATAAGAGGAG TCTCGAGTCC TGATCATATC TTTCTTCCGA 2400

TTCCAAACTG GGAGCACAAG GAGAACCCTG AGACTGAAGA AGATGTTGGG CCAGTTGTTC 2460 

AGCACATCTA TGAGCTGAGA AACAATGGTC CAAGTTCATT CAGCAAGGCA ATGCTCCATC 2520

TTCAGTGGCC TTACAAATAT AATAATAACA CTCTGTTGTA TATCCTTCAT TATGATATTG 2580

ATGGACCAAT GAACTGCACT TCAGATATGG AGATCAACCC TTTGAGAATT AAGATCTCAT 2640 

CTTTGCAAAC AACTGAAAAG AATGACACGG TTGCCGGGCA AGGTGAGCGG GACCATCTCA 2700 

TCACTAAGCG GGATCTTGCC CTCAGTGAAG GAGATATTCA CACTTTGGGT TGTGGAGTTG 2760 

CTCAGTGCTT GAAGATTGTC TGCCAAGTTG GGAGATTAGA CAGAGGAAAG AGTGCAATCT 2820 

TGTACGTAAA GTCATTACTG TGGACTGAGA CTTTTATGAA TAAAGAAAAT CAGAATCATT 2880 

CCTATTCTCT GAAGTCGTCT GCTTCATTTA ATGTCATAGA GTTTCCTTAT AAGAATCTTC 2940 

CAATTGAGGA TATCACCAAC TCCACATTGG TTACCACTAA TGTCACCTGG GGCATTCAGC 3000 

CAGCGCCCAT GCCTGTGCCT GTGTGGGTGA TCATTTTAGC AGTTCTAGCA GGATTGTTGC 3060

TACTGGCTGT TTTGGTATTT GTAATGTACA GGATGGGCTT TTTTAAACGG GTCCGGCCAC 3120 

CTCAAGAAGA ACAAGAAAGG GAGCAGCTTC AACCTCATGA AAATGGTGAA GGAAACTCAG 3180 

AAACTTAACT GCAGTTTTTA AGTTATGCTA CATCTTGACC CACTAGAATT AGCAACTTTA 3240 

TTATAGATTT AAACTTTCTT CATGAGGAGT AAAAATCCAA GGCTTTACTG CTGATAGTGC 3300 

TAATTGGCAT TAACCACAAA ATGAGAATTA TATTTGTCAA CCTTCTCCTT ATAAATAAGT 3360

TCAGACATAC ATTTAATAAC ATAGGGTGAC TTGTGTTTTT AGGTATTTAA ATAATAAAAT 3420 

TTCAAGGGAT AGTTTTTATT CAATGTATAT AAGACAGGTA GTGCCTGATT TACTACTTTA 3480

TATAAAATAG TACCTCCTTC AGTTACTGTT TCTGATTTAA TGTACGGAAC TTTATTTGTT 3540

GTTGTTGTTG TTGTTGTTGT TGTTGTTTTA AAGCAGTCCA AATTTGGACC TTAGCAATCA 3600

TGTCTTTTGT ATAGGTACTT AATGTTAATA CATATTACAC TACAGTTTAC TTTTCAGAAT 3660

ACTAAAGACT TTATAACTGC ATGAACTTGG ATTTTTTTAA TCACTCATAT GGTAGAATTT 3720

TATAAACACA TACATGATAC CATCCAAATT CTTGCTTTTA ATAACAAAGG TACAATATTT 3780 

TGTTTTAGTA TGAAAATCTG GTAGATCCTA TTACACTTCT GTTTATATTA AATCCACAAT 3840

ATTTTATTAC ATTTTTAACT TGTATAAATT TTAGGTCAAA TCCTTCAAGC CAACCTATAC 3900 

TAAAAATTAG TTCCATAATC ACAAATGGCT CTTTTGTGTA ATTGTTTAAT TTCACCTGAA 3960

TATCATAATG CTTAAAGCCA TATGGAGTTG GAAATTATTT CCAAAGCATA TTTATTCCAT 4020

TGTTTTAGTC TGGCTATTTA CAGTATAAAA AAAGCATTTT ATTAAAATAC TGTGTAGTTC 4080

TTTGAGATAG TTGCTTATGC ATATAGTAAG TATTACATTC TTAGAGTAGA GCAGAGTTTT 4140

TAGTTAGTAT TAATTTATTT TCCTCCATTC ATGTACTTTT CCTTATATTT CCAAAACTGT 4200 

TACTGAGAAT GGGTCAAGAT CAGTGAGAAA TCTTTACAGT TGACAGGAAC CTGGACCCCT 4260 

TACCCCAACT TTATGAGTAA TGCTTGGAAT AAAAAACTCT TAAGGCAACT CACTGATTTA 4320

CTTCTAGCAA TAGCATGATG TTACAGGAAT ATTACCTCTG TTTAAGCAAG GTAATGTGTA 4380 

AAATCAGTCT CGGCTGTCAG AATAACTTCT AAAAGGTATT TTTATAAGCA GTTCAAGTTA 4440

CTGAAAACCT TTTAAACCTT TCTGAAGTTC GTTAGTATAA ATTACTTTTC TAGGATTATT 4500

AATAAAAGCC ACATAGGTGG CAAGTTGTAG TTTTATATGG CTCTGTAGAG TGGTGAACCT 4560 

TCTAGAGGAA TATATGATTT ATTCACAGTT CCTCAAGGCC TGGGGATGAT GATCAGTTAT 4620

ACCTATTTTT GTGCAATTAC ATCATGTTGT ACATTAGAAA TGGAGAGTTT AATAGCTCTT 4680 

TAACTGCTGT CCTCATTAGG TAATGATAAA TATTTCCCTT AAATAATTGA CTATTTTGCT 4740 

GTGTTTTAAA AATGATTGAA ATTTATCTTG CCATATCTCA TAATTTCATG CACAAGTTGA 4800 

CTGAGCTAAT CTTGAGAATA TATTCGTAAA ATAGGAGCAC ATTTAGTTGA GGTATACAAG 4860 

GTAGGACTCT AGACAAAACC TTCTATTTTA GCTTTAGTGA ATTTCAAAAG TAATGGGTCT 4920 

TGGAGTATAG ATTTTTATTA GTAGCTTGAA AGAGCTTAAT CATATGCAGT AAGTATTTTT 4980

ATTACCAATA AATTTAAAAT TTTTTAAGAA AAATATTTTT ATCCTAGGGC CAAGTGTTGC 5040

CTGCCACCAA TCAGTAAGTT AGTCTATAAC AAATTTTACC CTAACAGTTT TACCACCTAG 5100 

CAACAGTCAT TTCTGAAAAT ATGTTGGATA GAAAGTCACT CTTTGGCAAA AGTGTTAGAA 5160 

TTTGCTTTTG TGCCATCTAT TCCTTTTATG GCATCTATCT TGAAAGTAAT CTTGTATTGG 5220

AGATTGAAAG ATGCTGTAAT TTAGAAATTA ACATGATATC TTAAATTACC TTTATGAAAT 5280

ATAGTTTTGT ATAATAGCAT AGATTTTCCT TCAAAAAATG AACATTTATA TATCTACAAA 5340

AATATGGAGA AGAGCAATTT GAAAGCCTAC TTTCTGAAGA AAATGGTGGG ATTTTTTTTT 5400 

ATCATGATTA AATATCAAAA AATTGCCCTA TGAAAACTTT AAATCTCTAA AACATTTGAA 5460 

ATACTACCAT ATTTGTGATT TATTGAGAAT AAAAATCCAT TTTGAAATGT AAAATTTTTA 5520

TGATCTGATT CAGTTTTAAG AAAACATGAA TGAACTAGAA GATATTAAAA ACATTTGACA 5580 

TTGGTAAGAA ATATTGATAC TGATATTGAT TTTTATATAG GTATTTATTT CAGAATTGAT 5640

ATTTTGAGAA AAATACATGT GAGTCATTTT TTCTGTTTCT CTTTTCTCTT AACGATTATC 5700
ACTGTAATTC TGAATCT 



1 11 21 31 41 51 

| | | | | |

MAFPPRRRLR LGPRGLPLLL SGLLLPLCRA FNLDVDSPAE YSGPEGSYFG E 

SSRMFLLVGA PKANTTQPGI VEGGQVLKCD WSSTRRCQPI BFDATGNRDY AKDDPLEFKS 

HQWFGASVRS KQDKILACAP LYHWRTEMKQ EREPVGTCFL QDGTKTVEYA PCRSQDIDAD 

GQGFCQGGFS IDFTKADRVL LGGPGSFYWQ GQLISDQVAE IVSKYDPNVY SIKYNNQLAT 
RTAQAIFDDS YLGYSVAVGD FNGDGIDDFV SGVPRAARTL GMVYIYDGKN MSSLYNFTGE 
QMAAYFGFSV AATDINGDDY ADVFIGAPLF MDRGSDGKLQ EVGQV3VSLQ RASGDFQTTK 360

LNGFEVFARF GSAIAPLGDL DQDGFNDIAI AAPYGGEDKK GIVYIFNGRS TGLNAVPSQI 420 

LEGQWAARSM PPSFGYSMKG ATDIDKNGYP DLIVGAFGVD RAILYRARPV ITVNAGLEVY 480 

PSILNQDNKT CSLPGTALKV SCFNVRFCLK ADGKGVLPRK LNFQVELLLD KLKQKGAIRR 540 

ALFLYSRSPS HSKNMTISRG GLMQCEELIA YLRDESSFRD KLTPITIFME YRLDYRTAAD 600

TTGLQPILNQ FTPANISRQA HILLDCGEDN VCKPKLEVSV DSDQKKIYIG DDNPLTLIVK 660

AQNQGEGAYE AELIVSIPLQ ADFIGVVRNN EALARLSCAF KTENQTRQVV CDLGNPMKAG 720

TQLLAGLRFS VHQQSEMDTS VKFDLQIQSS NLFDKVSPVV SHKVDLAVLA AVEIRGVSSP 780

DHIFLPIPNW EHKENPETEE DVGPVVQHIY ELRNNGPSSF SKAMLHLQWP YKYNNNTLLY 840

ILHYDIDGPM NCTSDMEINP LRIKISSLQT TEKNDTVAGQ GERDHLITKR DLALSEGDIH 900

TLGCGVAQCL KIVCQVGRLD RGKSAILYVK SLLWTETFMN KEKQNHSYSL KSSASFNVIE 960

FPYKNLPIED ITNSTLVTTN VTWGIQPAPM PVPVWVIILA VLAGLLLLAV LVFVMYRMGF 1020
FKRVRPPQEE QEREQLQPHE NGEGNSET 

Seq ID NO: 190 DNA sequence

Nucleic Acid Accession #: NM_004864 

Coding sequence: 25-952 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51 

| | | | | |

CGGAACGAGG GCAACCTGCA CAGCCATGCC CGGGCAAGAA CTCAGGACGG TGAATGGCTC 60
TCAGATGCTC CTGGTGTTGC TGGTGCTCTC GTGGCTGCCG CATGGGGGCG CCCTGTCTCT 120 
GGCCGAGGCG AGCCGCGCAA GTTTCCCGGG ACCCTCAGAG TTGCACTCCG AAGACTCCAG 180 
ATTCCGAGAG TTGCGGAAAC GCTACGAGGA CCTGCTAACC AGGCTGCGGG CCAACCAGAG 240 

CTGGGAAGAT TCGAACACCG ACCTCGTCCC GGCCCCTGCA GTCCGGATAC TCACGCCAGA 300
AGTGCGGCTG GGATCCGGCG GCCACCTGCA CCTGCGTATC TCTCGGGCCG CCCTTCCCGA 360
GGGGCTCCCC GAGGCCTCCC GCCTTCACCG GGCTCTGTTC CGGCTGTCCC C 
AAGGTCGTGG GACGTGACAC GACCGCTGCG GCGTCAGCTC AGCCTTGCAA G 
GCCCGCGCTG CACCTGCGAC TGTCGCCGCC GCCGTCGCAG TCGGACCAAC 1 

ATCTTCGTCC GCACGGCCCC AGCTGGAGTT GCACTTGCGG CCGCAAGCCG CCAGGGGGCG 600
CCGCAGAGCG CGTGCGCGCA ACGGGGACGA CTGTCCGCTC GGGCCCGGGC GTTGCTGCCG 660
TCTGCACACG GTCCGCGCGT CGCTGGAAGA CCTGGGCTGG GCCGATTGGG TGCTGTCGCC 720 
ACGGGAGGTG CAAGTGACCA TGTGCATCGG CGCGTGCCCG AGCCAGTTCC GGGCGGCAAA 780 
CATGCACGCG CAGATCAAGA CGAGCCTGCA CCGCCTGAAG CCCGACACGG AGCCAGCGCC 840 

CTGCTGCGTG CCCGCCAGCT ACAATCCCAT GGTGCTCATT CAAAAGACCG ACACCGGGGT 900
GTCGCTCCAG ACCTATGATG ACTTGTTAGC CAAAGACTGC CACTGCATAT GAGCAGTCCT 960
GGTCCTTCCA CTGTGCACCT GCGCGGGGGA GGCGACCTCA GTTGTCCTGC CCTGTGGAAT 1020 
GGGCTCAAGG TTCCTGAGAC ACCCGATTCC TGCCCAAACA GCTGTATTTA TATAAGTCTG 1080 
TTATTTATTA TTAATTTATT GGGGTGACCT TCTTGGGGAC TCGGGGGCTG GTCTGATGGA 1140 

ACTGTGTATT TATTTAAAAC TCTGGTGATA AAAATAAAGC TGTCTGAACT GTTAAAAAAA 1200
AAAA 

Seq ID NO: 191 Protein sequence: 
Protein Accession #: NP_004855


1 11 21 31 41 51 

| | | | | |

MPGQELRTVN GSQMLLVLLV LSWLPHGGAL SLAEASRASF PGPSELHSED SRFRELRKRY 60
EDLLTRLRAN QSWEDSNTDL VPAPAVRILT PEVRLGSGGH LHLRISRAAL PEGLPEASRL 120
HRALFRLSPT ASRSWDVTRP LRRQLSLARP QAPALHLRLS PPPSQSDQLL AESSSARPQL 180
ELHLRPQAAR GRRRARARNG DDCPLGPGRC CRLHTVRASL EDLGWADWVL SPREVQVTMC 240
IGACPSQFRA ANMHAQIKTS LHRLKPDTEP APCCVPASYN PMVLIQKTDT GVSLQTYDDL 300
LAKDCHCI 

Seq ID NO: 192 DNA sequence

Nucleic Acid Accession #: XM_061731.1 

Coding sequence: 1-567 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

ATGAGAAAAG GAAATGAGGG AGAGAACACA GAAGAGGGCA GGCTTGCTCA GCTTGCTCAA 

AGAAAGTTTC TCAAAGAAGA TGGCATTACA TTGCACATCT CTCTGTGTCT C

GTAAAAGAAC CTTTCTCTCT GATTGGACTT GACACACAGA AGGATCTCAG 1 
CTGTTGTTGA TGTCCACAGA CACTGGCAAG GACAGGTTTA CCAACATACT GCTGTCACAC 

TCCCCTCCAA TGTGCACCAA ATCACGTAAA AATGGGGATA ATGACTCCCC TGCCTTCACA
TGGGGTGGCA AAGACACCAG GAGCAATACT GATCTTCCTA TCAGAGACCC TGGGGGCAAG 
AGTCTTTCAC TCACCAAACA TTCCCACAAG CCTGTCCCTG AGCATCAGTG TGACCAGAGA 
GAGGTCTTCC AGCCACTTTC AGAGCCAGGT GTAGAAGCAG AGATGGAAGT GTTCGCTGAT 
GCTGGATGGT GGATTTATCA GAGCTGTCAG GTTCCTTCCT CAACCCTTGC AAGAAAGAAG 

ATGGTTTATT CTAAAGAAAC TGAGTGA

Seq ID NO: 193 Protein sequence:
Protein Accession #: XP_061731.1 
MRKGNEGENT EEGRLAQLAQ RKFLKEDGIT LHISLCLSIA VKEPFSLIGL DTQKDLSKDL
LLLMSTDTGK DRFTNILLSH SPPMCTKSRK NGDNDSPAFT WGGKDTRSNT DLPIRDPGGK
SLSLTKHSHK PVPEHQCDQR EVFQPLSEPG VEAEMEVFAD AGWWIYQSCQ VPSSTLARKK
MVYSKETE 

Seq ID NO: 194 DNA sequence
Nucleic Acid Accessi 

Coding sequence: 371-2410 (underlined sequences correspond to start and stop codons) 



CCAGACAACA 
CGTAGTTTAC 
CTCATATTCT 
ACTCCAGAGA 
TTTGGTGGAC 
CTCCGTGGGA 
GACCCTGAAG 



ACTCGTGGCT 
TATTGGTTTC 
AATTGTGATG 
CTTCCTGGTT 
TTTGCCAGTT 
AGCACCGTTG 
ATGTGCAGTT 
AATTGAACGA 
CTTGAAAGAA 
TGTTTCTGAG 
CTCATTCAAA 
CTTGAAAGAG 
GAACCTTGTC 
GTATCACACC 



CGGTGCCGCC 
TCCTCGTCTC 
CGCCATGGAA 
GATGCCCATA 
AGTATTTAAT 
GTTTACACAT 
ATGGCAACGC 
TACCTATGGA 
GCCAATGATG 
CAAGCCTGCA 
GTGAGCGAAA 
CTACTGATGG 
TCGTTTTTGA 
TCCCTCGTGG 
TCTTGGTTCG 
CGTGCATTCA 
TTCTATGCCT 
CTGGGCTTTG 
TTCTGTGCCC 
GAAATAAAGT 



GACCCGGGCC 
CCGCTCCGCC 
TTCTGCTCCG 



GTAGGGCCTG 
CTTGGAGATT 
GAAACCAGCA 
CAGTTCAGTC 



CTATACTTCC 
AGGTGAACAG 
GCGAATTCGA 
ATCTGAGATA 



GGTTTGGGGA 
TAGTGGCTTC 
CCTTCCCATC 
GTCCAAGAAG 
AGTCCCCATT 



TCCTGCTCCC 
TGGGAGCAGA 
GTCTCAAAAT 
TCCTTCTGGG 
CATTATGTTT 
CATGAAGAGC 
ACATGCACAG 
AAGTAGAGTC 
GGGAGCTTCT 
TTTGCAAGCA 
GCAATCTTGG 
GTGAACTTTG 
GAATAAAAAA 



GGAGATTGCA 
TATACCATGG 
AAGGGCGAAG 
ATGGACAGTT 
GACATGAGTG 
GAAGAATGGT 
CTTACAGCCT 
CCTCTGGTTG 
CCAATATGGC 
AGAAGAGTTA 
AGTATTGAAC 
AGTACAACAC 
GCTGTTGACT 
TCTGGAGTTA 
AGCTGTTTGA 
CTGAAGAATG 



TTTATATAAT 
CTTGAAAGGC 
TGATTACCAG 
TGCTCATCCT 
TAGCAAATTC 
TCCTAGCTAG 
CCATCCGGAA 
CCGGCTCAGT 
AGCTCCCTAT 
CAAAGGGGCA 
TGTCCCCACT 
TCCTCCATAA 
GCACAGTTGG 
ACAAACTTCC 
TTATCGTCTG 
GTAGTCCTTC 
AAACAAAGTT 
CCACTGTGCC 
TGGAGGAAGC 
TAGATAGCAC 
AAGCCGTCAG 
ATTCCGGCCT 
TGGGAGACTC 
CAATATGTGG 
AAATGGAGAA 
ACACCAGTTA 
TCAAGGCAGG 



31 
I 

GTGCCGTGTG 
CTCCCTTTTC 
TGCTTTTAGC 
AGCAGTAACT 
ATATATTATT 
GCTCAGTAGT 
TACTACAGCT 
GGGCTTCATT 
TTTTGGTACA 
CATCTTTGAA 
GGGCTTGATT 



TTCTGGAACC 
GGAGGGTGTC 
GCTTTCTGGA 



AATAAACCTC 



GTTCTTTGTA 
TGAAAGCCCC 
GTCTGTTGGT 
CCTCCAGGCT 
TCCAGAGAGA 
CGTGAATGGT 
CAACCAAATA 
GTACAAAGAG 
CGGTGACAAA 
CATGCCTCTG 



CCCGTGGCTC 
CCTGGATGAA 
CCTCCTGAGC 
CCCCAGCTCG 
TATTATAGCA 
TCTCTTACTA 
GCTACCGCCG 
ATTGCATTTG 
GCTGTGGGCT 
ACAGTGGGCT 
GACGTGGAGA 
TTTGGTTCTG 
CATTGTATTG 
AAGTGGTCTG 
ATTATGTCTG 
GTTCCTAATG 
TTTTCCATCA 

ACCATCCTCA 
TGTCCCAGGA 
TTAATGGAAA 



GTGGTGGAGG 
GAGAGGCTTC 
GCAGTGCAGT 

AACTCCAGTG 



CAGCCGCTGC 
CTTGCGTCCT 
CAAAGAAACC 
GTTTCTGTGC 
TTTTTGATAC 
AACAACCACT 
CTTCTGGTCC 
TCTTGGCATT 
CAGGTGTAGT 
CTGTCTTACT 
TGTACAACTC 
CTGTGTGGCA 
TTGGTGCAAC 
AACTGATAAA 
GAATTTTATT 
GTTTGCGAGC 
TGTATACTGG 

TGAAGAGAAA 
AAAAGAATAG 
ACAAGCATCC 
AGAGAACAGT 



GCTTTGGGTC 
CTTTATATTT 
TTCTACTCTA 
TCCAGACCAT 



CTGCAATGCT 
GATGGGTCTA 
TAAGCCTGAA 
ATTCGCCCAT 
GGTTTATGAC 
TGGTGGTGTT 



GATTCATTCC 
CCTAATGCAG 
GTGTCTGACC 
GGTGACAGAA 
GTCTCTCTCC 
GGTGGCAATG 



GCCACTCCCA 
AATTACATCT 
GCAATAATAG 
GTGCCAAAGA 



GGTATCTGTG 



TTCACTCAGC 
AAGGAAGTAA 
TCTTCCAGTT 
ACGTAAGCAA 
TTTCTTCAAA 
TTGGTCTGTG 



CCTCACTGTG 
GGGCTCTGTT 
TCGTAACATT 



TAGCTGTGTA 
CTGTGAATTC 
TAATGTTGTC 
CGTTTGACAG 
GGATTTAACA 
CTTGGTACTC 
TAGAGGGATG 
GTTTATTGAC 
TTATTTCTTT 
GGCAAGTTAA 
GCCTACAGTT 



AAATAGCCCG 
CTGTACATAT 
TCTGAAGATG 
AGCATGCTCT 
ACAAAAATAT 



AGGTTCTTTG 



TGTGTCAATG 
TAACAGAAGA 
CTATAACTGC 
GGTTCCACTG 
TTCTCTACTT 
ACTTGTGATT 

AACTACAACT 
TCAGTAGTGG 
AACACAGTGA 
AAGAAGAAGT 
GCAGTGTGGG 
CTTCCATGTT 
ACCCGAATTC 



GGTTTGTCAC 
ATGTCATCCT 
TCTTAGGTAT 
TCTTTTTATT 



GTGATTGCAT CAAATATTGG 
GTGTCTGTTG 
TTTATGGCCT 
ATCTTCAGAT 
TTTGGGACCA 
CTGACAAGAG 
TTTTGTGCTA 
GCTCCTGCTG 
TTTGTATCAG 
TTTTTTTCTT TTTTTTAAAC 
TTTCACCAGC TTCTGCCCTC 
TCTCTTATAT 
TGGCATATTC 
TAGTAACTTT 
AAGCCTGTTG 
GAAGTGGAAT 
CCTCTTAACT 



CAGGATCTAT 
AAATTTAAAT 
AAGAAAGAAA 
ATGGATGAAT 
CATTTGTCIA 



1020 
1080 
1140 
1200 



1920 
1980 
2040 
2100 
2160
2220 
2280 
2340 
2400 



3000 
3060 
3120 
3180 



11 



21 



I 



MATLITSTTA ATAASGPLVD YLWMLILGFI IAFVLAFSVG ANDVANSFGT AVGSGVVTLK
QACILASIFE TVGSVLLGAK VSETIRKGLI DVEMYNSTQG LLMAGSVSAM FGSAVWQLVA 
SFLKLPISGT HCIVGATIGF SLVAKGQEGV KWSELIKIVM SWFVSPLLSG IMSGILFFLV 
RAFILHKADP VPNGLRALPV FYACTVGINL FSIMYTGAPL LGFDKLPLKG TILISVGCAV 
FCALIVWFFV CPRMKRKIER EIKCSPSESP LMEKKNSLKE DHEETKLSVG DIENKHPVSE 
VGPATVPLQA VVEERTVSFK LGDLEEAPER ERLPSVDLKE ETSIDSTVNG AVQLPNGNLV 360

QFSQAVSNQI NSSGHSQYHT VHKDSGLYKE LLHKLHLAKV GDCKGDSGDK PLRRNNSYTS 420 

YTMAICGMPL DSFRAKEGEQ KGEEMEKLTW PNADSKKRIR MDSYTSYCNA VSDLHSASEI 480 

DMSVKAAMGL GDRKGSNGSL EEWYDQDKPE VSLLFQFLQI LTACFGSFAH GGNDVSNAIG 540

PLVALYLVYD TGDVSSKVAT PIWLLDYGGV GICVGLWVWG RRVIQTMGKD LTPITPSSGF 600 

SIELASALTV VIASNIGLPI STTHCKVGSV VSVGWLRSKK AVDWRLFRNI FMAWFVTVPI 660 
SGVISAAIMA IFRYVILRM 

Seq ID NO: 196 DNA sequence 

Nucleic Acid Accession #: NM_000020.1 

Coding sequence: 283-1794 (underlined sequences correspond to start and stop codons) 



AGGAAACGGT 
AGAAACATTT 
GAGCGAGCCC 



TTATTAGGAG 
TTGCTCCAGC 
CTCCCCGGCT 



AGGCTAGCGC 
AGGAAAGGCC 
TCTCGGGGCC 
CGGGGGGCCT 
CGGGGCTGCG 
CACTACTGCT 
CAACCTCCTT 
CTGGCCTTGC 
CAGGAGAAGC 
TCTGAGCAGG 
GGCTCAGGGC 
TGTGTGGGAA 



CGCTGGTGAC 
GGTGCACAGT 
GGAACTTGCA 



CGGAGCAGCC 
TGGCCCTGGT 
AGCGTGGCCT 



AACACAGTAT 
CGCAACTCGA 
GACTTTCTGC 
GCATGCGGCC 
GCCCACCGCG 
GCCGACCTGG 
AACCCGAGAG 
ACGGACTGCT 
GAGATTGCCC 
GATGTGGTGC 
CAGACCCCCA 
ATGATGCGGG 
AAGACACTAC 
AGCACCTGAT 
CTATCTGGGT 
TGCTCGGCCC 



TCCCCTTCCT 
AAGGCCGCTA 
TCTTCTCCTC 
TGCTCAGACA 
GCACGCAGCT 
AGAGACAGAC 
TGGCGCACCT 
ACTTCAAGAG 
GCCTGGCTGT 
TGGGCACCAA 
TTGAGTCCTA 
GCCGGACCAT 
CCAATGACCC 
CCATCCCTAA 
AGTGCTGGTA 
AAAAAATTAG 
TCCTTTCTGC 
AGAGGTAGTG 
CCAGCCCACC 



21 
I 

GGAGTGGTGG 
CCCCATCCCA 
CCAGCCCGGT 
GCGGCCGCGC 
GCAGAGCGGG 
GCTGATGGCC 
CTGCACGTGT 
AGTGCTGGTG 
CAGGGAGCTC 
CCTCTGCAAC 
GGGAACAGAT 
GGCCCTGGGT 
GCACAGCGAG 
GTTGGGGGAC 



AGCTGGGCCA 
GTCCCGGGAG 
CCGGGGCCGC 
GGTGGAGGGG 
CCCAGAGGGA 
TTGGTGACCC 
GAGAGCCCAC 
CGGGAGGAGG 
TGCAGGGGGC 
CACAACGTGT 
GGCCAGCTGG 
GTCCTGGGCC 



TGGCGAAGTG 
GAGGGATGAA 
CGACAACATC 
GTGGCTCATC 



GGCAGGAAGA 
GCTGCCGCGC 
GCCGGACCCC 
AGGTGGCCCC 
CCATGACCTT 
AGGGAGACCC 
ATTGCAAGGG 
GGAGGCACCC 
GCCCCACCGA 
CCCTGGTGCT 
CCCTGATCCT 
TGTGGCATGT 
CCAGTCTCAT 
GTGACTGCAC 
GGCAGGTTGC 



CAGTCCTGGT 
CTAGGCTTCA 
ACGCACTACC 
CATCTGGCTC 



GCACGTGGAG 
CCGCAATGTG 
GATGCACTCA 
GCGGTACATG 
CAAGTGGACT 
CGTGAATGGC 
CAGCTTTGAG 



GCACCCGAGG 
GACATCTGGG 
ATCGTGGAGG 



TCCGGGAGAC 
TCGCCTCAGA 
ACGAGCACGG 
TGAGGCTAGC 
CACAGGGCAA 
GCAACCTGCA 
ATTACCTGGA 
TGCTGGACGA 
CCTTTGGCCT 



CGCTGGAATA 
CAGCTGCGCC 
AGCCCGCCGT 
GGTCCGCCGA 
GGGCTCCCCC 
TGTGAAGCCG 
GCCTACCTGC 
CCAGGAACAT 
GTTCGTCAAC 
GGAGGCCACC 
GGGCCCCGTG 
CCGACGGAGG 
CCTGAAAGCA 
CACAGGGAGT 
CTTGGTGGAG 
TGAGAGTGTG 
TGAGATCTAT 
CATGACCTCC 
CTCCCTCTAC 
TGTGTCCGCG 
ACCAGCCATT 
GTGTTGCATC 
CATCGGCAAC 
GCAGATCCGC 



CCCAAACCCC 
CAACAGTCCA 
CTGCAGGGGG 



GCAGACCCGG 
TCTGCCCGAC 
GAGAAGCCTA 
CTGGGGGGGT 
GTGTGCTGGG 
TACAGCTGGG 



AGGTGGTGTG 
TCCTCTCAGG 
TCACCGCGCT 
AAGTGATTCA 



TGTGGATCAG 
CCTAGCTCAG 
GCGGATCAAG 



1200 
1260 
1320 



1740 
1800 
1860 
1920 



Seq ID NO: 197 Protein sequence
Protein Accession #: m 



MTLGSPRKGL LMLLMALVTQ G 



LVTCTCESPH



I 



LILGPVLALL 
DCTTGSGSGL 
RETEIYNTVL 
RLAVSAACGL 
YLDIGNNPRV 
YRPPFYDVVP
TALRIKKTLQ



AHLHVEIFGT 
GTKRYMAPEV 
NDPSFEDMKK 
KISNSPEKPK 



QVALVECVGK 
ASDMTSRNSS 
QGKPAIAHRD 
LDEQIRTDCF 
VVCVDQQTPT
VIQ 



CTVVLVREEG
LVLEATQPPS EQPGTDGQLA
SLILKASEQG DTMLGDLLDS 
WHGESVAVKI FSSRDEQSWF 
EHGSLYDFLQ RQTLEPHLAL 
NLQCCIADLG LAVMHSQGSD 
FGLVLWEIAR RTIVNGIVED 
IPNRLAADPV LSGLAQMMRE CWYPNPSARL



RGLHSELGES 
GRYGEVWRGL 
TQLWLITHYH 
FKSRNVLVKS 



Seq ID NO: 198 DNA sequence

Nucleic Acid Accession #: NM_003199.1 

Coding sequence: 200-2203 (underlined sequences correspond to start and stop codons) 



21 



31 



41 



51 



I I I I 

CGGGGGGATC TTGGCTGTGT GTCTGCGGAT CTGTAGTGGC GGCGGCGGCG GCGGCGGCGG 
GGAGGCAGCA GGCGCGGGAG CGGGCGCAGG AGCAGGCGGC GGCGGTGGCG GCGGCGGTTA 
GACATGAACG CCGCCTCGGC GCCGGCGGTG CACGGAGAGC CCCTTCTCGC GCGCGGGCGG 
TTTGTGTGAT TTTGCTAAAA TGCATCACCA ACAGCGAATG GCTGCCTTAG GGACGGACAA
AGAGCTGAGT GATTTACTGG ATTTCAGTGC GATGTTTTCA CCTCCTGTGA GCAGTGGGAA 
AAATGGACCA ACTTCTTTGG CAAGTGGACA TTTTACTGGC TCAAATGTAG AAGACAGAAG 
TAGCTCAGGG TCCTGGGGGA ATGGAGGACA TCCAAGCCCG TCCAGGAACT ATGGAGATGG 
GACTCCCTAT
TTTTGTCAAT 
AGAATCAAAC 
CAACCCAGGA 
TAATCCCCGA 



GACCACATGA 
TCCAGAATAC 
TTACAGGGTT 
ACCCTTTCGC 
AGGAGGCCTC 
CCAGGTTTGC 



CCAGCAGGGA 
AAAGTAAAAC 
GCCACCAGCA 
CCACCAAACC 
TTCACAGTAG 
CATCTTCAGT 



CAATAGGGAC 
CTTCTTCATG 
TCAGCCTGGC 
CTGTAGCCTG 
TTCCAGTCTT 



CAAGATGGCC 



AGCCGGCAGC 
AGATCACACT 
TCTCTCAGCA 
TTATGAAGGA 
TGATGCTATT 
TCATGGGGAC 
CTCAGGGTAT 
TCGTGAAGAT 
TCCACAGCTT 
CAGAGGCATG 
ATCCGATGAC 
AGATGACGAC 
ACCAGAGCAG 
TCTGCGGGTC 
CCTCAAGAGT 
CCTCAGTCTG 



CATCCACATG 
CCTCCGATGT 
CCTCCTGCCA 



AACAACAGCT 
GGCACAGCTG 
CCCTTACACT 
CATGTTCTCC 
ATGCATGGAA 



GGCGTGGCCC 
CCTGTCCAGT 
CCACCAGGAC 
GAGGGTGATG 



CCCTGGAATG 
CATTGCTTCA 
AAACACTTCT 
GCAGACGCAA 
AGAAAAAAGT 
AGTATGTACA 



AAGGCAGAGC 
CGTGACATCA 
GACAAGCCCC 
GAGCAGCAAG 
GAAGAGAAGG 
GGAGACGCAT 
TTAAAACAAG 
CCTTAACCCC 
GAGGTTTCAG
GCAACTTGAG 
GGCTGAGACA 



ATCACAGCAG 
TGTTGGGCAA 
AACGTTTGAG 
CCACTTTCCA 
ACGGGACAGA 
GAGATGCTCT 
TTTCATCAAA 
TTTGGTCTAG 
CTTTGCAAAG 
GGAACCATGC 
TCATTGGACC 
TTCTTTCAGC 
TGAGAGGCAG 
CTGCGACTTC 
TACAGGGGCA 
AGAACCTGCA 
TCAAATCAAT 
GTGAGAAGGA 
ACGAGGCTTT 
AGACCAAGCT 
TCCGAGAAAG 
TGTCCTCGGA 
CGAATCACAT 
AGACCACTTC 
CATTTTTGTA 
CATTCCCAAT 
GGACGACTTT 
CAGCCCAGAG 



CCTTGGGTCA 
AGAAAGGGGC 
GAGTCTCCTT 
TGGTTCCCAG 
TGCCATGGAG 
CTATGCTCCA 
CAAACCAGCA 
TGACCCTTGG 
CTCTTCTCAT 
CTATCCATCA 
TCGTAGTGGT 
CAGTATAATG 
GGGGAAAGCA 
CCCTTCAACT 



CCGAATTGAA 



TTCTCATAAT 
CAACAGACAT 
CCATTCTCTT 
CCCTGACCTG 
GAGTGTCTCC 
AGACACGAAA 
TACTAGCAAT 
GCGGAGGATG 
CAAAGAGCTC 
CCTGATCCTC 
GAATCTGAAT 
GCCTCCCCCT 



CATGACAATC 
TCATACTCAT 
GGAGGTGACA 
TACTATCAGT 
GTACAGACAA 
TCAGCAAGCA 
ACCAGCACTT 
AGCTCCTCCA 
ATTCCACAGT 
CACTCCTCAG 
ACAAACCATT 
GCAAATAGAG 
CTTGCTTCGA 
CCTGTTGGCT 
CAGGCCTCAT 
GATCGTTTAG 
TCCACAGCTA 



TCTCTCCACC 
CTTATGGGAG 
TGGATATGGG 



TCACTCATGG 
CTGCCAAACC 
AACCCACCCC 
TCTGGCAGCT 
TCTTCGGAGG 
AATGACGATG 
GCCAACAATG 



AGAAAGTTCG 
CTGCCGACTA 
TCCCTAGCTC 
GTGGGATGAA 
CCAGCAGCTA 
CAGACATCAA 
ACAGCACCTC 
GAAGCGGGGC 
TCTATTCTCC 
CTCCTCCATC 
CGTCTCCTAA 
AAAGACTGGA 
TGCCTGGTGG 
GTGGTCTGGG 
TGGGGACCCA 
AGGTTCCGGT 
AGGACCCTTA 



CTTAACAGCT 
ATATAAGACA 
TATCAAAAAA 
CTTTAACATA 
AGTGAACGGC 



CACCAGGCGG 
CCGAAAGCTG 
CXCTCCTTGG 
TAAAAGGGTC 
GTATTATCTT 



CAGAAAAACA 



ACAAGAAATT 
AGGACCTGAC 
CCCGAGAGCG 
TGCAGCTCCA 
TGGCCGTCAT 
CGTGTCTGAA 
CCGGCCCACA 
CAAGTTGCCA 
AAACCCACAT 
GTTATGAATC 
AAAAAAAGAA 
TGTGCAAAGC 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220
2280 
2340 
2400 
2460 



21 



51 



.1 I 

MHHQQRMAAL GTDKELSDLL DFSAMFSPPV SSGKNGPTSL ASGHFTGSNV EDRSSSGSWG 60 

NGGHPSPSRN YGDGTPYDHM TSRDLGSHDN LSPPFVNSRI QSKTERGSYS SYGRESNLQG 120 

CHQQSLLGGD MDMGNPGTLS PTKPGSQYYQ YSSNNPRRRP LHSSAMEVQT KKVRKVPPGL 180 

PSSVYAPSAS TADYNRDSPG YPSSKPATST FPSSFFMQDG HHSSDPWSSS SGMNQPGYAG 240 

MLGNSSHIPQ SSSYCSLHPH ERLSYPSHSS ADINSSLPPM STFHRSGTNH YSTSSCTPPA 300

NGTDSIMANR GSGAAGSSQT GDALGKALAS IYSPDHTNNS FSSNPSTPVG SPPSLSAGTA 360 

VWSRNGGQAS SSPNYEGPLH SLQSRIEDRL ERLDDAIHVL RNHAVGPSTA MPGGHGDMHG 420 

IIGPSHNGAM GGLGSGYGTG LLSANRHSLM VGTHREDGVA LRGSHSLLPN QVPVPQLPVQ 480 

SATSPDLNPP QDPYRGMPPG LQGQSVSSGS SEIKSDDEGD ENLQDTKSSE DKKLDDDKKD 540 

IKSITSNNDD EDLTPEQKAE REKERRMANN ARERLRVRDI NEAFKELGRM VQLHLKSDKP 600 

QTKLLILHQA VAVILSLEQQ VRERNLNPKA ACLKRREEEK VSSEPPPLSL AGPHPGMGDA 660 



Nucleic Acid Accession #: BC005987 (1-1286), BE888744 (1287-1756)

Coding sequence: 124-525 (underlined sequences correspond to start and stop codons) 



I 

GGCAGAAGAG 
GAGAATTGCA 
CAACTAAAAT 
GAAGACAAAG 
AACCTACTGG 
CGTAAAGCTG 
GTCACCTGGG 
ATTTATGTAG 
AGTCCAGAGC 



ACCTCTGGAC 
ATTGACCCTC 
CTGGCTCTGA 
TTAGTTGAAG 
AAGTTTTATC 
GAATACATAC 



GAAGATTTCT 
CTGCAACCAT 
GCCATTTCAC 
TATTTTACCG 
CCTATCTAAA 
AAGAGTTAAT 
GAAACTATGC 
ACAAGGTGAA 
TTGACTGTGA 
AGGTGTGCTT 
TGGCAATAGC 
TGAGGCAAGC 
AGCTTCATAA 
AAGCCTTGGA 
GAAGAAAAGA 
CAAACAATGC 



GCTGCCTGAA



GACTGAGTTT 
GCACCTCAAA 
CCAGCAAGAG 
CTGGGTCTAC 
ACATGTCTGT 
GGAAGGGTGG 
TGAGAAGGCT 
AAGCTACCGT 
CATTCGGCTG 
GATGCGTGAA 
GAAAGCCCCA 
TGAGCCAGAC 
CTACCTGCAT 



CAGAATCGTG 
GGGCAAAACG 
CATGCTGACC 
TATCACATGG 
GAGAAGTTTT 
ACACGGTTAA 
CTGGAAAAGA 
CTGGACAACT 
AATCCTGACA 
GAAGGTGAAG 
GGTGTAACAG 
AAAGCGATTG 
TGCCAAATTG 



41 

I 

CCGAGCCCTG 
CCTTGGAGAG 
AAAACTCCTT 
AATTCAAAGC 
AGGCAGCCCT 
AGGCAGAAAT 
GCCGACTCTC 
CCAGTCCCTA 
AGTGTGGARG 
AGCCAAAGAA 



ACCAGTACCT 
AGGAAGGTGA 
ATGTACTTCG 
AACTGCTTAA 
GGTGCTGCTA 



CCGAACAGCT 
CAGCCTACGG 
GGATGATTTT 
CACAATGTGC 
GGAATGCTTA 
CAGAAGTCTG 
AGACGTTCAG 
TAGAATTGAG 
AAACCAAAAT 
CCCAGAATTC 
TCAGAACGCC 
TAAAGTCCTC 
AGGAGAGAAG 
CAGTGCAGCC 
AAAGGCTTTA 
TAGGGCAAAA 
GTCTTCCAAG TAATGAATCT AAGAGAGAAT GGAATGTATG GGAAAAGAAA GTTACTGGAA 1020

CTAATAGGAC ACGCTGTGGC TCATCTGAAG AAAGCTGATG AGGCCAATGA TAATCTCTTC 1080 

CGTGTCTGTT CCATTCTTGC CAGCCTCCAT GCTCTAGCAG ATCAGTATGA AGAAGCAGAG 1140 

TATTACTTCC AAAAGGAATT CAGTAAAGAG CTTACTCCTG TAGCGAAACA ACTGCTCCAT 1200 

CTGCGGTATG GCAACTTTCA GCTGTACCAA ATGAAGTGTG AAGACAAGGC CATCCACCAC 1260 

TTTATAGAGG GTGTAAAAAT AAACCAGAAA TCAAGGGAGA AAGAAAAGAT GAAAGACAAA 1320 

CTGCAAAAAA TTGCCAAAAT GCGACTTTCT AAAAATGGAG CAGATTCTGA GGCTTTGCAT 1380

GTCTTGGCAT TCCTTCAGGA GCTGAATGAA AAAATGCAAC AAGCAGATGA AGACTCTGAG 1440 

AGGGGTTTGG AGTCTGGAAG CCTCATCCCT TCAGCATCAA GCTGGAATGG GGAATGAAGA 1500 

ATAGAGATGT GGTGCCCACT AGGCTACTGC TGAAAGGGAG CTGAAATTCC TCCACAAGTT 1560 

GGTATTCAAA ATATGTAATG ACTGGTATGG CAAAAGATTG GACTAAGACA CTGGCCATAC 1620

CACTGGACAG GGTTATGTTA AACCTGAATT GCTGGGTCTT AAAAGAGCCC AAGGAGTTCT 1680 

GGGAGAGGGA CAGATTGGGG GGTCGTCCAG GGCTGCGCTA AATTATTCTC AATGATTTGT 1740 
CTCTTTGCGG AACTTC 



1 11 21 31 41 51 

I I I I I I 

MSENNKNSLE SSLRQLKCHF TWNLMEGENS LDDFEDKVFY RTEFQNREFK ATMCNLLAYL 60 

KHLKGQNEAA LECLRKAEEL IQQEHADQAE IRSLVTWGNY AWVYYHKGRL SDVQIYVDKV 120 

KHVCEKFSSP YRIESPELDC EEGWTRLKCG GNQNERAKVC FEKALEKKPK NPEFTSGLAI 180

ASYRLDNWPP SQNAIDPLRQ AIRLNPDNQY LKVLLALKLH KMREEGEEEG EGEKLVEEAL 240

EKAPGVTDVL RSAAKFYRRK DEPDKAIELL KKALEYIPNN AYLHCQIGCC YRAKVFQVMN 300 

LRENGMYGKR KLLELIGHAV AHLKKADEAN DNLFRVCSIL ASLHALADQY EDAEYYFQKE 360

FSKELTPVAK QLLHLRYGNF QLYQMKCEDK AIHHFIEGVK INQKSREKEK MKDKLQKIAK 420 
MRLSKNGADS EALHVLAFLQ ELNEKMQQAD EDSERGLESG SLIPSAESWN GE 

Seq ID NO: 202 DNA sequence 
Nucleic Acid Accession #: NM_003090 

Coding sequence: 57-824 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GAATTCCGCG GGAGGCCACG GGCTTTCCAC AGCGCGGGGG AACGGGAGGC TGCAGGATGG 60 

TCAAGCTGAC GGCGGAGCTG ATCGAGCAGG CGGCGCAGTA CACCAACGCG GTGCGCGACC 120 

GGGAGCTGGA CCTCCGGGGG TATAAAATTC CCGTCATTGA AAATCTAGGT GCTACGTTAG 180 

ACCAGTTTGA TGCTATTGAT TTTTCTGACA ATGAGATCAG GAAACTGGAT GGTTTTCCTT 240 

TGTTGAGAAG ACTGAAAACA TTGTTAGTGA ACAACAACAG AATATGCCGT ATAGGTGAGG 300 

GACTTGATCA GGCTCTGCCC TGTCTGACAG AACTCATTCT CACCAATAAT AGTCTCGTGG 360 

AACTGGGTGA TCTGGACCCT CTGGCATCTC TCAAATCGCT GACTTACCTA AGTATCCTAA 420 

GAAATCCGGT AACCAATAAG AAGCATTACA GATTGTATGT GATTTATAAA GTTCCGCAAG 480 

TCAGAGTACT GGATTTCCAG AAAGTGAAAC TAAAAGAGCG TCAGGAAGCA GAGAAAATGT 540 

TCAAGGGCAA ACGGGGTGCA CAGCTTGCAA AGGATATTGC CAGGAGAAGC AAAACTTTTA 600 

ATCCAGGTGC TGGTTTGCCA ACTGACAAAA AGAGAGGTGG GCCATCTCCA GGGGATGTAG 660

AAGCAATCAA GAATGCCATA GCAAATGCTT CAACTCTGGC TGAAGTGGAG AGGCTGAAGG 720 

GGTTGCTGCA GTCTGGTCAG ATCCCTGGCA GAGAACGCAG ATCAGGGCCC ACTGATGATG 780 

GTGAAGAAGA GATGGAAGAA GACACAGTCA CAAACGGGTC CTGAGCAGTG AGGCAGATGT 840

ATAATAATAG GCCCTCTTGG AACAAGTCTT GCTTTTCGAA CATGGTATAA TAGCCTTGTT 900 

TGTGTTAGCA AAGTGGAATC TATCAGCATT GTTGAAATGC TTAAGACTGC TGCTGATAAT 960 

TTTGTAATAT AAGTTTTGAA ATCTAAATGT CAATTTTCTA CAAATTATAA AAATAAACTC 1020 
CACTCTCTAT GCTAAAAAAA AAAAAAAGGA ATTC 



Seq ID NO: 203 Protein sequence
Protein Accession #: NP_003081.1

1 11 21 31 41 51 

I I I I I I 

MVKLTAELIE QAAQYTNAVR DRELDLRGYK IPVIENLGAT LDQFDAIDFS DNEIRKLDGF 60
PLLRRLKTLL VNNNRICRIG EGLDQALPCL TELILTNNSL VELGDLDPLA SLKSLTYLSI 120
LRNPVTNKKH YRLYVIYKVP QVRVLDFQKV KLKERQEAEK MFKGKRGAQL AKDIARRSKT 180 
FNPGAGLPTD KKRGGPSPGD VEAIKNAIAN ASTLAEVERL KGLLQSGQIP GRERRSGPTD 240 
DGEEEMEEDT VTNGS 

Seq ID NO: 204 DNA sequence 

Nucleic Acid Accession #: NM_017643.1 

Coding sequence: 169-1401 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

AATAGCAATA GCTTTATAGC AGCTCCGGTT ACCTGTTTTA AACATGGAAG GAGAGTCGCT 60

CCCAGATAGC CCTCACGAGT GGCCCTGGAG CAGGGAGTGG TGGAGCAGAT CTTCCTTGTT 120 

TGGGAGGAGC CTGAGGTGGA CCTCGCGTCC TGAGTCTGGA AGGCACCTAT GGGGACCTGC 180

TGGGGTGATA TCTCAGAAAA TGTGAGAGTA GAAGTTCCCA ATACAGACTG CAGCCTACCT 240 
ACCAAAGTCT
TATGAAGGAT 
ATCCATCCAG 
CAGCATAAAT 
CTGCCTCCTG 
ATGAGAGTAG 
AGTGTAATTG 
TTCTGGTGCC 
CATCGATTCA 
CATTTATTTG 
AAATTGGAAG 
GTGCTAGCTG 
GACTGGTTCT 
AACATGATTG 
TACCTCAGGG 
AATCACGGAT 
ATATGTGTAG 
TGGGAAGAAG 
TGGTGTCAGT 
AGAAAAGGTG 
TCTTATGAGC 
AAGACTCAAC 
AGTCAATTAC 
TTTAGTGAGT 
AAATAGTGGA 



TCTGGATTGC 
TTGAAAATGA 
TTGGTTGGTG 
ATACAAACTG 
ATTTCTCCCA 



ATATGCACAG 
AAAGATCTGA 
CTAAGGTAAA 
CTATAGACCC 
ACGGATTCCT 
GTTACCATGC 
AACTTACTCC 
AAACTGGCTC 
TTCGTGTAGG 
CCACAGTAAC 
AGTATGATCA 



TGGAATTGTA 

TGCAGCCAGC 
GAAAGCTTTT 
AAAGGTTTCA 
CAAGAGGCAT 
AAGACTAGTG 
CCCATTAATA 
TATTACAAAG 



AAATTAGCAG GTTACAATGC CCTTTTAAGA 
GACTTCTGGT GCAATATATG TGGTTCTGAT 
GGAAAACCTC TTGTTCCTCC TAGAACTATT 
CTAGTGAAAC GACTTACTGG TGCCAAAACA 
GAGAGTATGC AGTATCCTTT CAAACCTTGC 



TATGAAGAAA 
CATCATATTG 
AAACAGGATG 
CAGAGTGGGG 



GATGATTGGG 
AACCTCTCCT 
ACCCAGAGGT 



AATGAAATTA 
TCGAATTATT 
GTGGGTAGAC 



TCCTTTTGTA
TCACAGGACA 
AACAATATCA 
ATAATGACTA 
TAAAAATTAC 
AAATGTCTCA 



ATTGTTAACT 
TGATGATTGT 
TGCAATAATG 
AGTATACGTG 
TGAAAGTTAA 
GGCTCACGCC 
AGATCGAGAC 



ATCACTTGAA 



GTATCGTCTA 
CAAGAGAAAA 
AAGGACATAA 
AATTGGAGTC 
GCATTATGTA 
AGAAACACTG 
GTAAAGAATA 
TCCTTTTTAA 
TGTAATCCCA 
CATCCTGGCT 
TGGTGGCACA 
CCCAGGAGGC 
CAGCAAGACT 



ATCGATGGCT 
TCTATTTTCC 
TACACAAAAC 
CCAGTAAAAC 
GAAGCAGTAG 
CATCGTCTCT 
TGTGAGTCAC 
CCTCCAGCAT 
AAAATCAGCA ATTCTCCAGA 
AGAATATACC TATGTCTGAT 
CAGAATCAGA CCATGTGTCC 
TAGAAACACA ACAGTCACCA 
ATACTAAAAG TTTATTGGTA 
ATGGTTTTGT 
CTTGGTAAAG 
AATTGTAAAC AATTATCTAG 
CCAATCAGCT TCATCAAAAC 
GAAAAGTGGG 
ACAGTATTCT 
TTATGCTTAA 
ATATTTTACT 
GAGTCTGTGA 
AAACTTTATT 
GCACTTTAGG 



GTTGGTCTCG 
GACATTTTGA 
AATGGTTCAA 
GTGTCGCAAC 
CAGAAGCAGC 
CTGTCGGTTT 
TTCCTTTTAA 
TATTTAATAA 
ATCTCATGGA 
TGAGGATACA 
CTGACCTCTA 
CACAGTGTAA 
GGACTATCTC 
TGGTTGCCAG 



AACAGATGAC 
AAGCATAGGT 
TACACCACCA 
GGAAGGAATG 
CATTAGAAAG 
AGACGGATCT 
CTGTGAAATT 
ATGGTTTGAC 



GCCACGTTTA 
TTTTGATGGA 
TCCTGTAGGG 
GTTGGTATAC 
ACATAAGTCA 
GTAAGACATT 



TGGACAGAAA 
AGGTGCAGTA 
AACAGGAGAA 
ATGATTCTTG 



AATTAAACTA 
GGTAATAAAT 
AGGAACAAGT 
ATAGTTCATA 
TACCAATTTT 
AGAAGAAAAA 
GTGTTCACAT 
ATGATATATC 
TGCCATAAAA 
ATGATTACCA 
AAATAATATG 
AATTAGCAGC 



GACTTACTAT 
GCTTTTGAGT 
ACCCTTATTT 
TAAGTTGTAT 
CCCTTTTTAT 
GGCTAAGTCC 
ACATTTTCTA 
TTGTGAGAAC 
GGCAAACCCT 
CAGTATTTAA 
TAAAACCTAC 
CAGGTGCAGT 



CGCCTGAAGT 
AGAGGTTGCA 
CTGTCTCAAA 



AACCCTGTCT CCACCAAAAA TACAAAAAAT 
CCCAGCTACT CAGGAGGCTG AGGCAAGAGA 
GTGGGCCAAG ATCACGCCAC TACATTCCAG 



1200 
1260 
1320 
1380 



2040 
2100
2160 
2220 
2280 
2340 
2400 
2460 
2520



31 



51 



I I I I I 

MGTCWGDISE NVRVEVPNTD CSLPTKVFWI AGIVKLAGYN ALLRYEGFEN DSGLDFWCNI 

CGSDIHPVGW CAASGKPLVP PRTIQHKYTN WKAFLVKRLT GAKTLPPDFS QKVSESMQYP 

FKPCMRVEVV DKRHLCRTRV AVVESVIGGR LRLVYEESED RTDDFWCHMH SPLIHHIGWS

RSIGHRFKRS DITKKQDGHF DTPPHLFAKV KEVDQSGEWF KEGMKLEAID PLNLSTICVA 

TIRKVLADGF LMIGIDGSEA ADGSDWFCYH ATSPSIFPVG FCEINMIELT PPRGYTKLPF 

KWFDYLRETG SIAAPVKLFN KDVPNHGFRV GMKLEAVDLM EPRLICVATV TRIIHRLLRI

HFDGWEEEYD QWVDCESPDL YPVGWCQLTG YQLQPPASQC KLVYRKGVBL 



Seq ID NO: 206 DNA sequence

Nucleic Acid Accession #: NM_012334 

Coding sequence: 223-6399 (underlined sequences correspond to start and stop codons)



GAGACAAAGG 
TGAGAAGGAC 
CGGGAGTGGC 
AGTCGGAGCG 
GAGGGAACAC 
TGTGCAGAAG 
AGCACAATTA 
GACATGGCGT 
TATAAGAGAA 
CAGCCCATCG 



CGCTACGACA 
ACTAAATTGA 
AAGGAGAAGA 
TTCGGCAATG 



11 

1 

CTGCCGTCGG 
AAGAAGGGAC 
GCCGTGACAC 
GCACTCGGCG 
GGGTCTGGCT 
GCATCGTCGT 
CCCACCAGAA 
CCTTGACAGA 
ATCAAATATA 
CCGGGCTGTA 
CCCCGCACAT 
ACCAGTGCAT 
TCCTCAAGTT 
CATCCTGTGT 
CGAAGACCGT 



GACGGGCGAG 
CGGGCGATGG 
GCATGGTTTC 
AGTCCGGGAC 



GGTGACTGCT 
GCTCCATGGC 
TACCTACATC 
CGAGCCTGCC 
CTTCGCCATC 
CCTCATCAGT 
TCTGTCAGTC 
TGAACGAGCT 
GTACAACAAC 



CCCGGACCCG 
TGCGCTGGAA 
GGCCAGCATT 
GACTATGGTC 
ATGCACCCCA 
GGCTCCATCA 
GGCTCCATCC 



GGGTTTGGGC 
GCCCCGCGGG 
CGGCGGCGCT 



GCCAACGAGT 
GGTGAAAGTG 
ATCAGTCAAC 
ATTCTTGAAA 
AACTCTAGTC 



I 

GAACAAAAGG 
CGCGCGTCCT 
GACTTCCGCG 
CTTCTTCACC 
TGTAAATTCC 
TTACAAGCAG 
GGGCGTGGAT 
ATTCCAGCGG 
GAACCCCTAC 
GCGCCACCTG 
CCTGTGGAAG 
AACCGAAAGC 
ATTGTCCTTA 



TGTATAACTT P 
TGGCCTCCGT G 
AGTACAGCCG G 
GCTACCGCTG C 
GGGCAGGTAA A 
AGTCTTTGGA P. 
GCAGCCCCAT C 
GCTTTGGGAA GTTTGTTCAG 900 
CTGAACATCT GTCAGAAAGG AAATATTCAG
AAAAACCGAG TAGTAAGGCA AAATCCCGGG 
CTGGCAGGGC TGGAACATGA AGAAAGAGAA 
CACTACTTGA ATCAGTCTGG ATGTGTAGAA 
AGGGAAGTTA TTACGGCAAT GGACGTGATG
TCGAGGCTGC TTGCTGGTAT ACTGCATCTT 
GCACAGGTTT CCTTCAAAAC AGCTTTGGGC 
ACACAGCTCA CAGATGCTTT GACCCAGAGA 
ACGCCTCTCA ATGTTCAACA GGCAGTAGAC 

GCGTGCTGCT TTGAGTGGGT AATCAAGAAG
TTCAAGTCTA TTGGCATCCT CGACATCTTT 
GAACAGTTCA ATATAAACTA TGCAAACGAG 
TTTTCTTTAG AACAACTAGA ATATAGCCGG 
ATAGACAATG GAGAATGCCT GGACTTGATT 

AATGAAGAAA GCCATTTTCC TCAAGCCACA
CAGCATGCGA ATAACCACTT TTATGTGAAG 
AAGCACTATG CTGGAGAGGT GCAATATGAT 
ACATTTCGAG ATGACCTTCT CAATTTGCTA 
CTTTTTGAAC ATGTTTCAAG CCGCAACAAC 

CGGCGGCCTA CAGTCAGCTC ACAGTTCAAG
AGCTCCTCTA ATCCTTTCTT TGTTCGCTGT 
CAGTTTGACC AGGCGGTTGT GCTGAACCAG 
AGAATCCGCA AAGCTGGGTA TGCGGTCCGA 
AAAGTGCTGA TGAGGAATCT GGCTCTGCCT 

CTGCAGCTCT ATGATGCCTC CAACAGCGAG
CGAGAATCCT TGGAACAGAA ACTGGAGAAG 
ATGGTGATTC GGGCCCATGT CTTGGGCTTC 
TATTGTGTGG TGATAATACA GAAGAATTAC 
CACCTGAAAA AGGCAGCCAT AGTTTTCCAG 

GTTTACAGAC AATTGCTGGC AGAGAAAAGG
GAAGAAAAGA AGAAACGGGA GGAAGAAGAA 
GAGCTCCGCG CCCAGCAGGA AGAAGAAACG 
AAGAGCCAGA AGGAAGCTGA ACTGACCCGT 
GTGGAAGAGA TCCTCCGTCT GGAGAAAGAA 

CAGGAGCTGT CGCTGACCGA GGCTTCCCTG
CTCCGCAGGC TGGAGGAGGA AGCGTGCAGG 
TTCGACGAGA TCGACGAGTG TGTCCGGAAT 
TTTTCCAGCG AGCTGGCTGA GAGCGCATGC 
CCCTACCCAG AGGAGGAGGT CGATGAGGGC 

TCCCCCAACC CCAGCGAGCA CGGCCACTCA
GATGACTCTT CAGAGGAGGA CCCATACATG 
GCGGACAGCA CGGTGCTGCT CGCCCCATCA 
TCCAGCGGCG AGTCCACCTA CTGCATGCCC 
GGCGACTACG ACTACGACCA GGATGACTAT 

GTGACCTTCT CCAACTCCTA CGGCAGCCAG
ACCTACAACA GCTCGGGTGC CTACCGGTTC 
GATAGTGAAG AGGACTTTGA TTCCAGGTTT 
GACTCTGTGT ACAGCTGTGT CACTCTGCCG 
GGCCTGATGA ACTCTTGGAA ACGCCGCTGG 

TTCCGCTCCA AGCAGGAGGC CCTCAAGCAA
TCCACGCTGT CCAGGAGAAA TTGGAAGAAG 
ATGTACTTTG AAAACGACAG CGAGGAGAAG 
AAAGAGATCA TAGATAACAC CACCAAGGAG 
ACTTTCCACC TGATTGCAGA GTCCCCAGAA 

CAGGTCCACG CGTCCACGGA CCAGGAGATC
CAGAATGCTG TGGGCACCTT GGATGTGGGG 
CCTGATAGAC CCAACTCGTT TGTGATCATC 
GACACGCCGG AGGAGATGCA CCACTGGATA 
AGAGTGGAGG GCCAGGAATT CATCGTGAGA 

CCGAAGATGT CTTCACTGAA ACTGAAGAAA
GATTACTACA AGAGTTCAGA GAAGAACGCG
CTCTGCTCTG TCGTCCCCCC AGATGAGAAG
ACCGTGTACG GGCGCAAGCA CTGTTACCGG
CGGTGGTCCA GTGCCATTCA AAACGTGACT

CAGCAGCTGA TTCAAGATAT CAAGGAGAAC
TACAAGCGGA ACCCGATCCT TCGATACACC 
CTTCCGTATG GGGACATAAA TCTCAACTTG 
GATGAGGCCA TCAAGATATT CAATTCCCTG 
CCAATAATCC AGGGCATCCT ACAGACAGGG 

TACTGCCAGC TTATCAAACA GACCAACAAA
TACAGCTGGC AGATCCTGAC ATGCCTGAGC 
AAGTATCTCA AGTTCCATCT GAAAAGGATA 
AAATACGCTC TCTTCACTTA CGAATCTCTT 
TCCCGAGATG AAATAGAAGC TCTGATCCAC 

CATGGCGGCG GCTCCTGCAA GATCACCATC
GAGAAGCTGA TCCGAGGCCT GGCCATGGAG 



GGCGGGAGAA TTGTAGATTA TTTATTAGAA 960 

GAAAGGAATT ATCACATATT TTATGCACTG 1020 

GAATTTTATT TATCTACGCC AGAAAACTAC 1080 

GACAAGACAA TCAGTGACCA GGAATCCTTT 1140 

CAGTTCAGCA AGGAGGAAGT TCGGGAAGTG 1200 

GGGAACATAG AATTTATCAC TGCTGGTGGG 1260 

AGATCTGCGG AGTTACTTGG GCTGGACCCA 1320

TCAATGTTCC TCAGGGGAGA AGAGATCCTC 1380 

AGCAGGGACT CCCTGGCCAT GGCTCTGTAT 1440 

ATCAACAGCA GGATCAAAGG CAATGAGGAC 1500

GGATTTGAAA ACTTTGAGGT TAATCACTTT 1560

AAACTTCAGG AGTACTTCAA CAAGCATATT 1620

GAAGGATTAG TGTGGGAAGA TATTGACTGG 1680

GAGAAGAAAC TTGGCCTCCT AGCCCTTATC 1740 

GACAGCACCT TATTGGAGAA GCTACACAGT 1800 

CCCAGAGTTG CAGTTAACAA TTTTGGAGTG 1860 

GTCCGAGGTA TCTTGGAGAA GAACAGAGAT 1920

AGAGAAAGCC GATTTGACTT TATCTACGAT 1980

CAGGATACCT TGAAATGTGG AAGCAAACAT 2040

GACTCACTGC ATTCCTTAAT GGCAACGCTA 2100

ATCAAGCCAA ACATGCAGAA GATGCCAGAC 2160

CTGCGGTACT CAGGGATGCT GGAGACTGTG 2220

AGACCCTTTC AGGACTTTTA CAAAAGGTAT 2280 

GAGGACGTCC GAGGGAAGTG CACGAGCCTG 2340 

TGGCAGCTGG GGAAGACCAA GGTCTTTCTT 2400

CGGAGGGAAG AGGAAGTGAG CCACGCGGCC 2460

TTAGCACGAA AACAATACAG AAAGGTCCTT 2520

AGAGCATTCC TTCTGAGGAG GAGATTTTTG 2580 

AAGCAACTCA GAGGTCAGAT TGCTCGGAGA 2640

GAGCAAGAAG AAAAGAAGAA ACAGGAAGAG 2700

AGAGAAAGAG AGAGAGAGCG AAGAGAAGCC 2760

AGGAAGCAGC AAGAACTCGA AGCCTTGCAG 2820

GAACTGGAGA AACAGAAGGA AAATAAGCAG 2880 

ATCGAGGACC TGCAGCGCAT GAAGGAGCAG 2940

CAGAAGCTGC AGGAGCGGCG GGACCAGGAG 3000 

GCGGCCCAGG AGTTCCTCGA GTCCCTCAAT 3060 

ATCGAGCGGT CCCTGTCGGT GGGAAGCGAA 3120 

GAGGAGAAGC CCAACTTCAA CTTCAGCCAG 3180 

TTCGAAGCCG ACGACGACGC CTTCAAGGAC 3240

GACCAGCGAA CAAGTGGCAT CCGGACCAGC 3300 

AACGACACGG TGGTGCCCAC CAGCCCCAGT 3360 

GTGCAGGACT CCGGGAGCCT ACACAACTCC 3420 

CAGAACGCTG GGGACTTGCC CTCCCCAGAC 3480

GAGGACGGTG CCATCACTTC CGGCAGCAGC 3540

TGGTCCCCCG ACTACCGCTG CTCTGTGGGG 3600 

AGCTCTGAGG GGGCGCAGTC CTCGTTTGAA 3660 

GATACAGATG ATGAGCTTTC ATACCGGCGT 3720 

TATTTCCACA GCTTTCTGTA CATGAAAGGT 3780

TGCGTCCTCA AGGATGAAAC CTTCTTGTGG 3840

GGCTGGCTCC ACAAAAAAGG GGGGGGCTCC 3900 

CGCTGGTTTG TCCTCCGCCA GTCCAAGCTG 3960 

CTCAAGGGCA CCGTAGAAGT GCGAACGGCA 4020 

AATGGGATCG ACATCATTAT GGCCGATAGG 4080 

GATGCCAGCC AGTGGTTCAG CGTGCTGAGT 4140

CAGGAGATGC ATGATGAGCA GGCAAACCCA 4200 

CTGATTGATT CTGTGTGTGC CTCTGACAGC 4260 

ACGGCCAACC GGGTGCTGCA CTGCAACGCC 4320 

ACCCTGCTGC AGAGGTCCAA AGGGGACACC 4380 

GGATGGTTGC ACAAAGAGGT GAAGAACAGT 4440

CGGTGGTTTG TACTCACCCA CAATTCCCTG 4500 

CTCAAACTGG GGACCCTGGT CCTCAACAGC 4560 

ATATTCAAAG AGACAGGCTA CTGGAACGTC 4620

CTCTACACCA AGCTGCTCAA CGAGGCCACC 4680

GACACCAAGG CCCCGATCGA CACCCCCACC 4740

TGCCTGAACT CGGATGTGGT GGAACAGATT 4800 

CATCACCCCT TGCACTCCCC GCTCCTGCCC 4860 

CTCAAAGACA AAGGCTATAC CACCCTTCAG 4920 

CAGCAACTGG AGTCCATGTC TGACCCAATT 4980 

CATGACCTGC GACCTCTGCG GGACGAGCTG 504 0 

GTGCCCCACC CCGGCAGTGT GGGCAACCTG 5100 

TGCACCTTCC TGCCGAGTCG AGGGATTCTC 5160

CGGGAACAGT TTCCAGGAAC CGAGATGGAA 5220

AAGAAAACCA AATGCCGAGA GTTTGTGCCT 5280

AGGCAGGAAA TGACATCCAC GGTCTATTGC 5340 

AACTCCCACA CCACTGCTGG GGAGGTGGTG 5400 

GACAGCAGGA ACATGTTTGC TTTGTTTGAA 5460 
TACAACGGCC ACGTCGACAA AGCCATTGAA AGTCGAACCG TCGTAGCTGA TGTCTTAGCC 5520

AAGTTTGAAA AGCTGGCTGC CACATCCGAG GTTGGGGACC TGCCATGGAA ATTCTACTTC 5580 

AAACTTTACT GCTTCCTGGA CACAGACAAC GTGCCAAAAG ACAGTGTGGA GTTTGCATTT 5640 

ATGTTTGAAC AGGCCCACGA AGCGGTTATC CATGGCCACC ATCCAGCCCC GGAAGAAAAC 5700 

CTCCAGGTTC TTGCTGCCCT GCGACTCCAG TATCTGCAGG GGGATTATAC TCTGCACGCT 5760

GCCATCCCAC CTCTCGAAGA GGTTTATTCC CTGCAGAGAC TCAAGGCCCG CATCAGCCAG 5820

TCAACCAAAA CCTTCACCCC TTGTGAACGG CTGGAGAAGA GGCGGACGAG CTTCCTAGAG 5880

GGGACCCTGA GGCGGAGCTT CCGGACAGGA TCCGTGGTCC GGCAGAAGGT CGAGGAGGAG 5940 

CAGATGCTGG ACATGTGGAT TAAGGAAGAA GTCTCCTCTG CTCGAGCCAG TATCATTGAC 6000 

AAGTGGAGGA AATTTCAGGG AATGAACCAG GAACAGGCCA TGGCCAAGTA CATGGCCTTG 6060 

ATCAAGGAGT GGCCTGGCTA TGGCTCGACG CTGTTTGATG TGGAGTGCAA GGAAGGTGGC 6120 

TTCCCTCAGG AACTCTGGTT GGGTGTCAGC GCGGACGCCG TCTCCGTCTA CAAGCGTGGA 6180

GAGGGAAGAC CACTGGAAGT CTTCCAGTAT GAACACATCC TCTCTTTTGG GGCACCCCTG 6240 

GCGAATACGT ATAAGATCGT GGTCGATGAG AGGGAGCTGC TCTTTGAAAC CAGTGAGGTG 6300 

GTGGATGTGG CCAAGCTCAT GAAAGCCTAC ATCAGCATGA TCGTGAAGAA GCGCTACAGC 6360 

ACGACACGCT CCGCCAGCAG CCAGGGCAGC TCCAGGTGAA GGCGGGACAG AGCCCACCTG 6420 

TCTTTGCTAC CTGAACGCAC CACCCTCTGG CCTAGGCTGG CTCCAGTGTG CCATGCCCAG 6480 

CCAAAACAAA CACAGAGCTG CCCAGGCTTT CTGGAAGCTT CTGGTCTGAG GGAGGTGTCT 6540 

CCGAGGATCC TTTTGCCTGC CGCCTTCATT GATCCTGTAT TAAGCTGTCA ACTTTAACAG 6600 

TCTGCACAGT TTCCAAAGCT TTACTACTCT TAGAGGACAC ATGCCTTAAA AAAGGAGGGG 6660 

AGGAACCACG CTGCCACCAA AGCAGCCGGA AGTGCCTTAA CTTGTGGAAC CAACACTAAT 6720 

CGACCGTAAC TGTGCTACTG AAGGGAACTG CCTTTCCCCC TTCTGGGGGA GACTTAACAG 6780 

AGCGTGGAAG GGGGGCATTC TCTGTCAATG ATGCACTAAC CTCCCAACCT GATTTCCCCG 6840 

AATCTGAGGG AAGGTGAGGG AGTGGGAAGG GGGATGGAGA GCTCGAGGGG ACAGTGTGTT 6900 

TGAGCTGGAG TGCTGCGGGC AGCCTTTCTC ATGGAATGAC ATGAATCAAC TTTTTTCTTT 6960 

GTTTCATCTT TTAAGTGTAC GTGCTTGCCT GTTCGTGCAT GTGTTCATAA ACTCAACACT 7020 

TTAATCATGG TTTCATGAGC ATTAAAAAGC AAAGGGAAAA AGGATGTGTA ATGGTGTACA 7080 

CAGTCTGTAT ATTTTAATAA TGCAGAGCTA TAGTCTCAAT TGTTACTTTA TAAGGTGGTT 7140

TTATTAACAA ACCCAAATCC TGGATTTTCC TGTCTTTGCT GTATTTTGAA AAACACGTGT 7200

TGACTCCATT GTTTTACATG TAGCAAAGTC TGCCATCTGT GTCTGCTGTA TTATAAACAG 7260 

ATAAGCAGCC TACAAGATAA CTGTATTTAT AAACCACTCT TCAACAGCTG GCTCCAGTGC 7320

TGGTTTTAGA ACAAGAATGA AGTCATTTTG GAGTCTTTCA TGTCTAAAAG ATTTAAGTTA 7380 

AAAACAAAGT GTTACTTGGA AGGTTAGCTT CTATCATTCT GGATAGATTA CAGATATAAT 7440

AACCATGTTG ACTATGGGGG AGAGACGCTG CATTCCAGAA ACGTCTTAAC ACTTGAGTGA 7500 

ATCTTCAAAG GACCCTGACA TTAAATGCTG AGGCTTTAAT ACACACATAT TTTATCCCAA 7560 

GTTTATAATG GTGGTCTGAA CAAGGCACCT GTAAATAAAT CAGCATTTAT GACCAGAAGA 7620 

AAAATAATCT GGTCTTGGAC TTTTTATTTT TATATGGAAA AGTTTTAAGG ACTTGGGCCA 7680 

ACTAAGTCTA CCCACACGAA AAAAGAAATT TGCCTTGTCC CTTTGTGTAC AACCATGCAA 7740
AACTGTTTGT TGGCTCACAG AAGTTCTGAC AATAAAAGAT ACTAGCT 



1 11 21 31 41 51 

I I I I I I 

MDNFFTEGTR VWLRENGQHF PSTVNSCAEG IVVFRTDYGQ VFTYKQSTIT HQKVTAMHPT 60

NEEGVDDMAS LTELHGGSIM YNLFQRYKRN QIYTYIGSIL ASVNPYQPIA GLYEPATMEQ 120 

YSRRHLGELP PHIFAIANEC YRCLWKRYDN QCILISGESG AGKTESTKLI LKFLSVISQQ 180

SLELSLKEKT SCVERAILES SPIMEAFGNA KTVYNNNSSR FGKFVQLNIC QKGNIQGGRI 240 

VDYLLEKNRV VRQNPGERNY HIFYALLAGL EHEEREEFYL STPENYHYLN QSGCVEDKTI 300

SDQESFREVI TAMDVMQFSK EEVREVSRLL AGILHLGNIE FITAGGAQVS FKTALGRSAE 360

LLGLDPTQLT DALTQRSMFL RGEEILTPLN VQQAVDSRDS LAMALYACCF EWVIKKINSR 420 

IKGNEDFKSI GILDIFGFEN FEVNHFEQFN INYANEKLQE YFNKHIFSLE QLEYSREGLV 480 

WEDIDWIDNG ECLDLIEKKL GLLADINEES HFPQATDSTL LEKLHSQHAN NHFYVKPRVA 540

VNNFGVKHYA GEVQYDVRGI LEKNRDTFRD DLLNLLRESR FDFIYDLFEH VSSRNNQDTL 600

KCGSKHRRPT VSSQFKDSLH SLMATLSSSN PFFVRCIKPN MQKMPDQFDQ AVVLNQLRYS 660

GMLETVRIRK AGYAVRRPFQ DFYKRYKVLM RNLALPEDVR GKCTSLLQLY DASNSEWQLG 720

KTKVFLRESL EQKLEKRREE EVSHAAMVIR AHVLGFLARK QYRKVLYCVV IIQKNYRAFL 780

LRRRFLHLKK AAIVFQKQLR GQIARRVYRQ LLAEKREQEE KKKQEEEEKK KREEEERERE 840

RERREAELRA QQEEETRKQQ ELEALQKSQK EAELTRELEK QKENKQVEEI LRLEKEIEDL 900 

QRMKEQQELS LTEASLQKLQ ERRDQELRRL EEEACRAAQE FLESLNFDEI DECVRNIERS 960

LSVGSEFSSE LAESACEEKP NFNFSQPYPE EEVDEGFEAD DDAFKDSPNP SEHGHSDQRT 1020 

SGIRTSDDSS EEDPYMNDTV VPTSPSADST VLLAPSVQDS GSLHNSSSGE STYCMPQNAG 1080 

DLPSPDGDYD YDQDDYEDGA ITSGSSVTFS NSYGSQWSPD YRCSVGTYNS SGAYRFSSEG 1140 

AQSSFEDSEE DFDSRFDTDD ELSYRRDSVY SCVTLPYFHE FLYMKGGLMK SWKRRWCVLK 1200 

DETFLWFRSK QEALKQGWLH KKGGGSSTLS RRNWKKRWFV LRQSKLMYFE NDSEEKLKGT 1260 

VEVRTAKEII DNTTKENGID IIMADRTFHL IAESPEDASQ WFSVLSQVHA STDQEIQEMH 1320

DEQANPQNAV GTLDVGLIDS VCASDSPDRP HSFVIITANR VLHCNADTPE EMHHWITLLQ 1380

RSKGDTRVEG QEFIVRGWLH KEVKNSPKMS SLKLKKRWFV LTHNSLDYYK SSEKMALKLG 1440 

TLVLNSLCSV VPPDEKIFKE TGYWNVTVYG RKHCYRLYIK LLNEATRWSS AIQNVTDTKA 1500

PIDTPTQQLI QDIKEMCLNS DVVEQIYKRN PIDRYTHHPL HSPLLPLPYG DINLKLLKDK 1560

GYTTLQDEAI KIFNSLQQLE SMSDPIPIIQ GILQTGHDLR PLRDELYCQL IKQTNKVPHP 1620 

GSVGNLYSWQ ILTCLSCTFL PSRGILKYLK FHLKRIREQF PGTEMEKYAL FTYESLKKTK 1680 

CREFVPSRDE IEALIHRQEM TSTVYCHGGG SCKITINSHT TAGEVVEKLI RGLAMEDSRN 1740

MFALFEYNGH VDKAIESRTV VADVLAKFEK LAATSEVGDL PWKFYFKLYC FLDTDNVPKD 1800 

SVEFAFMFEQ AHEAVIHGHH PAPEENLQVL AALRLQYLQG DYTLHAAIPP LEEVYSLQRL 1860 
KARISQSTKT FTPCERLEKR RTSFLEGTLR RSFRTGSVVR QKVEEEQMLD MWIKEEVSSA 1920

RASIIDKWRK FQGMNQEQAM AKYMALIKEW PGYGSTLFDV ECKEGGFPQE LWLGVSADAV 1980

SVYKRGEGRP LEVFQYEHIL SFGAPLANTY KIVVDERELL FETSEVVDVA KLMKAYISMI 2040
VKKRYSTTRS ASSQGSSR 

Seq ID NO: 208 DNA sequence

Nucleic Acid Accession #: XM_059761.1

Coding sequence: 124-525 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I

CGAAGATCTA TCCAAAATCA AGAAGCCTTT GATTTAGATG TTGCTGTAAA AGAAAATAAA 60

GATGATCTCA ATCATGTGGA TTTGAATGTG TGTACAAGCT TTTCGGGCCC GGGTAGGAGT 120

GGCATGGCTC TTATGGAAGT TAACCTATTA AGTGGCTTTA TGGTGCCTTC AGAAGCAATT 180

TCTCTGAGCG AGACAGTGAA GAAAGTGGAA TATGATCATG GAAAACTCAA CCTCTATTTA 240 

GATTCTGTAA ATGAAACCCA GTTTTGTGTT AATATTCCTG CTGTGAGAAA CTTTAAAGTT 300 

TCAAATACCC AAGATGCTTC AGTGTCCATA GTGGATTACT ATGAGCCAAG GAGACAGGCG 360

GTGAGAAGTT ACAACTCTGA AGTGAAGCTG TCCTCCTGTG ACCTTTGCAG TGATGTCCAG 420

GGCTGCCGTC CTTGTGAGGA TGGAGCTTCA GGCTCCCATC ATCACTCTTC AGTCATTTTT 480

ATTTTCTGTT TCAAGCTTCT GTACTTTATG GAACTTTGGC TGTGAITTAT TTTTAAAGGA 540

CTCTGTGTAA CACTAACATT TCCAGTAGTC ACATGTGATT GTTTTGTTTT CGTAGAAGAA 600 

TACTGCTTCT ATTTTGAAAA AAGAGTTTTT TTTCTTTCTA TGGGGTTGCA GGGATGGTGT 660 

ACAACAGGTC CTAGCATGTA TAGCTGCATA GATTTCTTCA CCTGATCTTT GTGTGGAAGA 720

TCAGAATGAA TGCAGTTGTG TGTCTATATT TTCCCCTCTC AAAATCTTTT AGAATTTTTT 780
TGGAGGTGTT TGTTTTCTCC AGAATAAAGG TATTACTTTA G 



Seq ID NO: 209 Protein sequence



Protein Accession #: XP_059761.1 

1 11 21 31 41 51 

| | | | | |

MALMEVNLLS GFMVPSEAIS LSETVKKVEY DHGKLNLYLD SVNETQFCVN IPAVRNFKVS

NTQDASVSIV DYYEPRRQAV RSYNSEVKLS SCDLCSDVQG CRPCEDGASG SHHHSSVIFI 
FCFKLLYFME LWL 



Seq ID NO: 210 DNA sequence
Nucleic Acid Accession #: NM_015472 

Coding sequence: 258-1460 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

GACACACTCC TCTACAACAC CAGAGACTCC CAAACACAAG GCCTTATATT GACTCATTTC 60 

AGCTCACATC CTGGCGACTC TCAAGAGAGA AACCTCAGAG TGACTAAAAT CTCCATAATG 120 

AGAAGACATG TACATTCAGT ATCTATTTTG GCATTTTCCC CAATACATCT CTGCTCATCT 180 

GACTCTTATC TTGGCATCTG CTTCCTGGTG GATCTGAACT GACCCATAAG CCACGCTTAC 240

TGGTGATTTT CCAGAAGATG AATCCGGCCT CGGCGCCCCC TCCGCTCCCG CCGCCTGGGC 300 

AGCAAGTGAT CCACGTCACG CAGGACCTAG ACACAGACCT CGAAGCCCTC TTCAACTCTG 360 

TCATGAATCC GAAGCCTAGC TCGTGGCGGA AGAAGATCCT GCCGGAGTCT TTCTTTAAGG 420

AGCCTGATTC GGGCTCGCAC TCGCGCCAGT CCAGCACCGA CTCGTCGGGC GGCCACCCGG 480 

GGCCTCGACT GGCTGGGGGT GCCCAGCATG TCCGCTCGCA CTCGTCGCCC GCGTCCCTGC 540

AGCTGGGCAC CGGCGCGGGT GCTGCGGGTA GCCCCGCGCA GCAGCACGCG CACCTCCGCC 600

AGCAGTCCTA CGACGTGACC GACGAGCTGC CACTGCCCCC GGGCTGGGAG ATGACCTTCA 660 

CGGCCACTGG CCAGAGGTAC TTCCTCAATC ACATAGAAAA AATCACCACA TGGCAAGACC 720

CTAGGAAGGC GATGAATCAG CCTCTGAATC ATATGAACCT CCACCCTGCC GTCAGTTCCA 780

CACCAGTGCC TCAGAGGTCC ATGGCAGTAT CCCAGCCAAA TCTCGTGATG AATCACCAAC 840 

ACCAGCAGCA GATGGCCCCC AGTACCCTGA GCCAGCAGAA CCACCCCACT CAGAACCCAC 900 

CCGCAGGGCT CATGAGTATG CCCAATGCGC TGACCACTCA GCAGCAGCAG CAGCAGAAAC 960 

TGCGGCTTCA GAGAATCCAG ATGGAGAGAG AAAGGATTCG AATGCGCCAA GAGGAGCTCA 1020 

TGAGGCAGGA AGCTGCCCTC TGTCGACAGC TCCCCATGGA AGCTGAGACT CTTGCCCCAG 1080 

TTCAGGCTGC TGTCAACCCA CCCACGATGA CCCCAGACAT GAGATCCATC ACTAATAATA 1140

GCTCAGATCC TTTCCTCAAT GGAGGGCCAT ATCATTCGAG GGAGCAGAGC ACTGACAGTG 1200 

GCCTGGGGTT AGGGTGCTAC AGTGTCCCCA CAACTCCGGA GGACTTCCTC AGCAATGTGG 1260 

ATGAGATGGA TACAGGAGAA AACGCAGGAC AAACACCCAT GAACATCAAT CCCCAACAGA 1320 

CCCGTTTCCC TGATTTCCTT GACTGTCTTC CAGGAACAAA CGTTGACTTA GGAACTTTGG 1380

AATCTGAAGA CCTGATCCCC CTCTTCAATG ATGTAGAGTC TGCTCTGAAC AAAAGTGAGC 1440

CCTTTCTAAC CTGGCTGTAA TCACTACCAT TGTAACTTGG ATGTAGCCAT GACCTTACAT 1500 

TTCCTGGGCC TCTTGGAAAA AGTGATGGAG CAGAGCAAGT CTGCAGGTGC ACCACTTCCC 1560 

GCCTCCATGA CTCGTGCTCC CTCCTTTTTA TGTTGCCAGT TTAATCATTG CCTGGTTTTG 1620 
ATTGAGAGTA ACTTAAGTTA AACATAAATA AATATTCTAT TTTCATTTTC 

Seq ID NO: 211 Protein sequence: 
Protein Accession #: NP_056287.1 

1 11 21 31 41 51 

| | | | | |
MNPASAPPPL PPPGQQVIHV TQDLDTDLEA LFNSVMNPKP SSWRKKILPE SFFKEPDSGS 60

HSRQSSTDSS GGHPGPRLAG GAQHVRSHSS PASLQLGTGA GAAGSPAQQH AHLRQQSYDV 120

TDELPLPPGW EMTFTATGQR YFLNHIEKIT TWQDPRKAMN QPLNHMNLHP AVSSTPVPQR 180

SMAVSQPNLV MNHQHQQQMA PSTLSQQNHP TQNPPAGLMS MPNALTTQQQ QQQKLRLQRI 240

QMERERIRMR QEELMRQEAA LCRQLPMEAE TLAPVQAAVN PPTMTPDMRS ITNNSSDPFL 300

NGGPYHSREQ STDSGLGLGC YSVPTTPEDF LSNVDEMDTG ENAGQTPMNI NPQQTRFPDF 360
LDCLPGTNVD LGTLESEDLI PLFNDVESAL NKSEPFLTWL

Seq ID NO: 212 DNA sequence
Nucleic Acid Accession #: NM_018174

Coding sequence: 176-2194 (underlined sequences correspond to start and stop codons) 



CATCTCCCCC 
GCGCGGCGAG 
TCTGCCACTC 



CGTGCTGTTC 
GCACTTGAGG 
AGCCGAGAGC 
CACCCACCCT 
TGAGGCCCCA 
CAAACCGAGT 



AGCCGCGGCC 
CTGGACATGT 
TGCGCCCTGC 
CCCGGTTGCA 
TTCCTGCGAG 
AAAGAGAGCG 
AGACCTGGCC 
CGCAAGACTG 
GTCTCCCGGA 



TCGTGTTCTT 
AGCTGGCGCT 
CCGTGCCAGC 
ATGTGCTGCA 
TGGTGTGGCA 
CCCCGCCCGC 
AGCCCGTGGT 
TGGGCTCCCG 
AGGAGCGCCC 
AGAAAGAAGC 



GCCGGCGCCG AGCGCACGCT 240



CCACTCTGGC 
AGAAGCCAGC 
CAGCCTGGAG 
CGCCAGCTCA 
GGGCAGCGAG 



TTCCCGCCGG 
CCCCCCAGTG 
CTGGGGCCGA 
ATCCCAAGGC 
CGGCTGTCGC 
ACCACACCCA 



CGCCCCCACC 
GGCTTCCCCA 
GGCGGTGCCA 
GTCACAGGAA 
GTCCCTGCCC 
AGACGAAGAC 
GGTCCCCCCA 
CCCCAAGACA 
CAACTCACGC 
TGCTGGTGGG 
AGGCCGGGCA 
GGGGTCAGCC 
GGACCTGGCC 
GCGCGTGCGC 
GCGGGCCGTC 
GACCCTGATC 
CCGGCACCAG 
TGACGCCTTC 
AGCCCAGCCC 



AGTGAGGCTG 
CACGATGTGG 
ATGGCACCGG 



CCCAGGCGGC 
TGGCAAATGG 
CAGCCTGCGG 
TCCCAGCCGG 
CACGCACACC 
TGAGCCCACT 
CGGTGACCAC 
AGTCCCTGTC 
GGCTGAGCCT 
ACCTGTGCCT 
CACCTGCGTC 



CAACGCCTGC G 
GAGCCTCCTG G 
CAAACCCACC G 
CCCGCCCTCC G 
CCCCGCCGGC CCCGGCGAGA AGGTGGTGCG 
CTGCCTCCTG GACGGCCTGG TCCGCCTGCA 
GACGCCCCAG GACCTGGAGG GGCCGGGGCG
GGACAGCTCG AAGAGAGAGG GCCTCCTGGC 
TGGGGTGGCC CGCAAGGAGC CAGCACGGGC 
AGAAAGACCC 
CTTCTGTGCC 
CCAGCACGTC 
TCCGATGTGG 
TGGCCACGCC 
TGCCTTTGGC 
GCCCCGCAGA 
CAGACGCCTC 
TGGGCTCCCC 
TGCCGCCATC 



GGAGGTGCGC 
ACCCAAGCCC 
ACCCCGCAGC 
CTCTCCGGCC 
GGAGGAGAAG 
CTCCCCTGAG 



GCCCTCACTA 
GGTGTCCTTT 
CCCGCTGCGT 
GGTGTCACCC 



CGCAAAGCGC 
CCGCCCAGCC 
TCCCAGCTGG 
GCACTGGAGC 
TCCCACCGGA 
GAGGCCGGGC 



GAGCAGGTGC 



ACCCTGTCTG 



CCACTGCCTG 
GCACGGCAAA 
GCTGCCGCCC 
GACCGTGCCA 
CCCCTGTCCA 
AGCAGCCGGC 
TACCTGCCCA 
GCGCTCTGCT 
CTGGACGCGC 
CCCACTTTCG 
GCGCTGGGCA 
CCGGCCTGCA 
GCCTGTCCCT 



ACTCGGATCC 
TTGGAGTCCC 
ACCCATCCAG 
CGGAGAACGT 
CCAAAGCCAC 
GCCGACCACT 
GAAAGTCCTC 
CCGGGGTGTC 
GCGGGAGCAG 
ACGTCATCAG 
TACTGGCCAG 



CGTGCCCCTG 
TCGCCACGAC 
CATCTGCATG 
CAGCCGCACC 
TCCAGTGGCT 



TGTGAATTTG 
TCGAATGACA 
CCACCCACAT 
GCCCCCGGTG 
CCTTTGCCTG 
GTGGACCCCG 
CGGAAGCCCC 
GCTGCCAAAA 



AGCATCGCAA 



AACCCCCAAG 
AGCCACCCCA 
CGCCCACCTG 
TGGCCAGGAC 
CAAGCAGCAT 



TCACGGTGTT 
AGGTGGAGTT 
AGATTCAGCC 



GGGCAGCAAC 
CTAGCCCCAT 
ACATCAGAAA 



Seq ID NO: 213 Protein 



Protein Accession ft: NP_060644.1 



MGVGRLDMYV 
LQHLRFLREP 
RAEAPRKTEK 
TSHSGFPPVA 
LAASSIPRPR 
SPHSTEVDES 
RKAVPMAPAP 
DSDEDTEGFG 
RPNSRAAAPK 



LHPPSAGAER TLASVCALLV 
WTPQDLEGP GRAESKESVG 
EAKTPRELKK DPKPSVSRTQ 
NGPRSPPSLR CGEASPPSAA 



LSVSFEQVLP PSAPTSEAGL 
ASPGSSNDSS ARSQERAGGL 
VPRHDPLPDP LKVPPPLPDP 
ATPVAAAKTK GLAGGDRASR
VSATPPKSPV YLDLAYLPSG 
ASKQHWDRDL QVTLIPTFDS 



WHPAGPGEKV 
SRDSSKREGL 
PREVRRAASS 
CGSPASQLVA 
PLRGGEAGPD 
SLPLRGPRAR 
GAEETPPTSV 



PLSARSEPSE 
SSAHLVDEEF 
VAMHTWYAET 



GGCATGGTGT 
CGCCGACACG 
TAAACTGTGA 



VRVLFPGCTP
LATHPRPGQE 
VPNLKKTNAQ 
TPSLELGPIP 
ASPTVTTPTV 
RSASPHDVDL 
SESLPTLSDS
LPPKTARQTE 
KGGRAPLSRK 
FQRVRALCYV 
HARHQALGIT 



1260 
1320 
1380 



ACCCCCTCAA 
AGATGCTGCC 
TGGCCCGCCC 
CCAAGGGGCT 
GTGAGAAGGG 
GAGGCCCGTC 
CGGTCTACCT 
AGTTCTTCCA 
AGGAAGGCAT 
ACCTGCAGGT 
AGACGCACGC 
CCATGCAGGA 
CCCCCCACTC 
CTACACTTG 



RPGVARKEPA 
AAPKPRKAPS 
AGEEKALELP 
TTPSLPAEVG 
CLVSPCEFEH 
DPVPLAPGAA 
NVSRTRKPLA 
SSTPKTATRG 
ISGQDQRKEE 
VLGSNGMVSM 



) start and stop codons) 
GGCTGGAGCC GCGAGACGGG CGCTCAGGGC GCGGGGCCGG CGGCGGCGAA CGAGAGGACG 180

GACTCTGGCG GCCGGGTCGT TGGCCGGGGG AGCGCGGGCA CCGGGCGAGC AGGCCGCGTC 240

GCGCTCACCA TGGTCAGCTA CTGGGACACC GGGGTCCTGC TGTGCGCGCT GCTCAGCTGT 300

CTGCTTCTCA CAGGATCTAG TTCAGGTTCA AAATTAAAAG ATCCTGAACT GAGTTTAAAA 360 

GGCACCCAGC ACATCATGCA AGCAGGCCAG ACACTGCATC TCCAATGCAG GGGGGAAGCA 420

GCCCATAAAT GGTCTTTGCC TGAAATGGTG AGTAAGGAAA GCGAAAGGCT GAGCATAACT 480 

AAATCTGCCT GTGGAAGAAA TGGCAAACAA TTCTGCAGTA CTTTAACCTT GAACACAGCT 540 

CAAGCAAACC ACACTGGCTT CTACAGCTGC AAATATCTAG CTGTACCTAC TTCAAAGAAG 600

AAGGAAACAG AATCTGCAAT CTATATATTT ATTAGTGATA CAGGTAGACC TTTCGTAGAG 660

ATGTACAGTG AAATCCCCGA AATTATACAC ATGACTGAAG GAAGGGAGCT CGTCATTCCC 720

TGCCGGGTTA CGTCACCTAA CATCACTGTT ACTTTAAAAA AGTTTCCACT TGACACTTTG 780

ATCCCTGATG GAAAACGCAT AATCTGGGAC AGTAGAAAGG GCTTCATCAT ATCAAATGCA 840 

ACGTACAAAG AAATAGGGCT TCTGACCTGT GAAGCAACAG TCAATGGGCA TTTGTATAAG 900 

ACAAACTATC TCACACATCG ACAAACCAAT ACAATCATAG ATGTCCAAAT AAGCACACCA 960 

CGCCCAGTCA AATTACTTAG AGGCCATACT CTTGTCCTCA ATTGTACTGC TACCACTCCC 1020

TTGAACACGA GAGTTCAAAT GACCTGGAGT TACCCTGATG AAAAAAATAA GAGAGCTTCC 1080 

GTAAGGCGAC GAATTGACCA AAGCAATTCC CATGCCAACA TATTCTACAG TGTTCTTACT 1140

ATTGACAAAA TGCAGAACAA AGACAAAGGA CTTTATACTT GTCGTGTAAG GAGTGGACCA 1200

TCATTCAAAT CTGTTAACAC CTCAGTGCAT ATATATGATA AAGCATTCAT CACTGTGAAA 1260 

CATCGAAAAC AGCAGGTGCT TGAAACCGTA GCTGGCAAGC GGTCTTACCG GCTCTCTATG 1320

AAAGTGAAGG CATTTCCCTC GCCGGAAGTT GTATGGTTAA AAGATGGGTT ACCTGCGACT 1380

GAGAAATCTG CTCGCTATTT GACTCGTGGC TACTCGTTAA TTATCAAGGA CGTAACTGAA 1440 

GAGGATGCAG GGAATTATAC AATCTTGCTG AGCATAAAAC AGTCAAATGT GTTTAAAAAC 1500 

CTCACTGCCA CTCTAATTGT CAATGTGAAA CCCCAGATTT ACGAAAAGGC CGTGTCATCG 1560 

TTTCCAGACC CGGCTCTCTA CCCACTGGGC AGCAGACAAA TCCTGACTTG TACCGCATAT 1620

GGTATCCCTC AACCTACAAT CAAGTGGTTC TGGCACCCCT GTAACCATAA TCATTCCGAA 1680 

GCAAGGTGTG ACTTTTGTTC CAATAATGAA GAGTCCTTTA TCCTGGATGC TGACAGCAAC 1740

ATGGGAAACA GAATTGAGAG CATCACTCAG CGCATGGCAA TAATAGAAGG AAAGAATAAG 1800

ATGGCTAGCA CCTTGGTTGT GGCTGACTCT AGAATTTCTG GAATCTACAT TTGCATAGCT 1860 

TCCAATAAAG TTGGGACTGT GGGAAGAAAC ATAAGCTTTT ATATCACAGA TGTGCCAAAT 1920

GGGTTTCATG TTAACTTGGA AAAAATGCCG ACGGAAGGAG AGGACCTGAA ACTGTCTTGC 1980

ACAGTTAACA AGTTCTTATA CAGAGACGTT ACTTGGATTT TACTGCGGAC AGTTAATAAC 2040

AGAACAATGC ACTACAGTAT TAGCAAGCAA AAAATGGCCA TCACTAAGGA GCACTCCATC 2100 

ACTCTTAATC TTACCATCAT GAATGTTTCC CTGCAAGATT CAGGCACCTA TGCCTGCAGA 2160 

GCCAGGAATG TATACACAGG GGAAGAAATC CTCCAGAAGA AAGAAATTAC AATCAGAGAT 2220

CAGGAAGCAC CATACCTCCT GCGAAACCTC AGTGATCACA CAGTGGCCAT CAGCAGTTCC 2280

ACCACTTTAG ACTGTCATGC TAATGGTGTC CCCGAGCCTC AGATCACTTG GTTTAAAAAC 2340

AACCACAAAA TACAACAAGA GCCTGGAATT ATTTTAGGAC CAGGAAGCAG CACGCTGTTT 2400 

ATTGAAAGAG TCACAGAAGA GGATGAAGGT GTCTATCACT GCAAAGCCAC CAACCAGAAG 2460 

GGCTCTGTGG AAAGTTCAGC ATACCTCACT GTTCAAGGAA CCTCGGACAA GTCTAATCTG 2520

GAGCTGATCA CTCTAACATG CACCTGTGTG GCTGCGACTC TCTTCTGGCT CCTATTAACC 2580 

CTCCTTATCC GAAAAATGAA AAGGTCTTCT TCTGAAATAA AGACTGACTA CCTATCAATT 2640

ATAATGGACC CAGATGAAGT TCCTTTGGAT GAGCAGTGTG AGCGGCTCCC TTATGATGCC 2700 

AGCAAGTGGG AGTTTGCCCG GGAGAGACTT AAACTGGGCA AATCACTTGG AAGAGGGGCT 2760

TTTGGAAAAG TGGTTCAAGC ATCAGCATTT GGCATTAAGA AATCACCTAC GTGCCGGACT 2820

GTGGCTGTGA AAATGCTGAA AGAGGGGGCC ACGGCCAGCG AGTACAAAGC TCTGATGACT 2880

GAGCTAAAAA TCTTGACCCA CATTGGCCAC CATCTGAACG TGGTTAACCT GCTGGGAGCC 2940 

TGCACCAAGC AAGGAGGGCC TCTGATGGTG ATTGTTGAAT ACTGCAAATA TGGAAATCTC 3000 

TCCAACTACC TCAAGAGCAA ACGTGACTTA TTTTTTCTCA ACAAGGATGC AGCACTACAC 3060 

ATGGAGCCTA AGAAAGAAAA AATGGAGCCA GGCCTGGAAC AAGGCAAGAA ACCAAGACTA 3120

GATAGCGTCA CCAGCAGCGA AAGCTTTGCG AGCTCCGGCT TTCAGGAAGA TAAAAGTCTG 3180 

AGTGATGTTG AGGAAGAGGA GGATTCTGAC GGTTTCTACA AGGAGCCCAT CACTATGGAA 3240

GATCTGATTT CTTACAGTTT TCAAGTGGCC AGAGGCATGG AGTTCCTGTC TTCCAGAAAG 3300

TGCATTCATC GGGACCTGGC AGCGAGAAAC ATTCTTTTAT CTGAGAACAA CGTGGTGAAG 3360 

ATTTGTGATT TTGGCCTTGC CCGGGATATT TATAAGAACC CCGATTATGT GAGAAAAGGA 3420

GATACTCGAC TTCCTCTGAA ATGGATGGCT CCCGAATCTA TCTTTGACAA AATCTACAGC 3480

ACCAAGAGCG ACGTGTGGTC TTACGGAGTA TTGCTGTGGG AAATCTTCTC CTTAGGTGGG 3540 

TCTCCATACC CAGGAGTACA AATGGATGAG GACTTTTGCA GTCGCCTGAG GGAAGGCATG 3600 

AGGATGAGAG CTCCTGAGTA CTCTACTCCT GAAATCTATC AGATCATGCT GGACTGCTGG 3660

CACAGAGACC CAAAAGAAAG GCCAAGATTT GCAGAACTTG TGGAAAAACT AGGTGATTTG 3720

CTTCAAGCAA ATGTACAACA GGATGGTAAA GACTACATCC CAATCAATGC CATACTGACA 3780

GGAAATAGTG GGTTTACATA CTCAACTCCT GCCTTCTCTG AGGACTTCTT CAAGGAAAGT 3840 

ATTTCAGCTC CGAAGTTTAA TTCAGGAAGC TCTGATGATG TCAGATATGT AAATGCTTTC 3900 

AAGTTCATGA GCCTGGAAAG AATCAAAACC TTTGAAGAAC TTTTACCGAA TGCCACCTCC 3960 

ATGTTTGATG ACTACCAGGG CGACAGCAGC ACTCTGTTGG CCTCTCCCAT GCTGAAGCGC 4020

TTCACCTGGA CTGACAGCAA ACCCAAGGCC TCGCTCAAGA TTGACTTGAG AGTAACCAGT 4080

AAAAGTAAGG AGTCGGGGCT GTCTGATGTC AGCAGGCCCA GTTTCTGCCA TTCCAGCTGT 4140

GGGCACGTCA GCGAAGGCAA GCGCAGGTTC ACCTACGACC ACGCTGAGCT GGAAAGGAAA 4200

ATCGCGTGCT GCTCCCCGCC CCCAGACTAC AACTCGGTGG TCCTGTACTC CACCCCACCC 4260 

ATCTAGAGTT TGACACGAAG CCTTATTTCT AGAAGCACAT GTGTATTTAT ACCCCCAGGA 4320

AACTAGCTTT TGCCAGTATT ATGCATATAT AAGTTTACAC CTTTATCTTT CCATGGGAGC 4380

CAGCTGCTTT TTGTGATTTT TTTAATAGTG CTTTTTTTTT TTGACTAACA AGAATGTAAC 4440

TCCAGATAGA GAAATAGTGA CAAGTGAAGA ACACTACTGC TAAATCCTCA TGTTACTCAG 4500 

TGTTAGAGAA ATCCTTCCTA AACCCAATGA CTTCCCTGCT CCAACCCCCG CCACCTCAGG 4560 

GCACGCAGGA CCAGTTTGAT TGAGGAGCTG CACTGATCAC CCAATGCATC ACGTACCCCA 4620

CTGGGCCAGC CCTGCAGCCC AAAACCCAGG GCAACAAGCC CGTTAGCCCC AGGGGATCAC 4680
TGGCTGGCCT GAGCAACATC TCGGGAGTCC TCTAGCAGGC CTAAGACATG TGAGGAGGAA 4740

AAGGAAAAAA AGCAAAAAGC AAGGGAGAAA AGAGAAACCG GGAGAAGGCA TGAGAAAGAA 4800 

TTTGAGACGC ACCATGTGGG CACGGAGGGG GACGGGGCTC AGCAATGCCA TTTCAGTGGC 4860 

TTCCCAGCTC TGACCCTTCT ACATTTGAGG GCCCAGCCAG GAGCAGATGG ACAGCGATGA 4920

GGGGACATTT TCTGGATTCT GGGAGGCAAG AAAAGGACAA ATATCTTTTT TGGAACTAAA 4980

GCAAATTTTA GACCTTTACC TATGGAAGTG GTTCTATGTC CATTCTCATT CGTGGCATGT 5040 

TTTGATTTGT AGCACTGAGG GTGGCACTCA ACTCTGAGCC CATACTTTTG GCTCCTCTAG 5100 

TAAGATGCAC TGAAAACTTA GCCAGAGTTA GGTTGTCTCC AGGCCATGAT GGCCTTACAC 5160

TGAAAATGTC ACATTCTATT TTGGGTATTA ATATATAGTC CAGACACTTA ACTCAATTTC 5220 

TTGGTATTAT TCTGTTTTGC ACAGTTAGTT GTGAAAGAAA GCTGAGAAGA ATGAAAATGC 5280

AGTCCTGAGG AGAGTTTTCT CCATATCAAA ACGAGGGCTG ATGGAGGAAA AAGGTCAATA 5340

AGGTCAAGGG AAGACCCCGT CTCTATACCA ACCAAACCAA TTCACCAACA CAGTTGGGAC 5400 

CCAAAACACA GGAAGTCAGT CACGTTTCCT TTTCATTTAA TGGGGATTCC ACTATCTCAC 5460

ACTAATCTGA AAGGATGTGG AAGAGCATTA GCTGGCGCAT ATTAAGCACT TTAAGCTCCT 5520 

TGAGTAAAAA GGTGGTATGT AATTTATGCA AGGTATTTCT CCAGTTGGGA CTCAGGATAT 5580

TAGTTAATGA GCCATCACTA GAAGAAAAGC CCATTTTCAA CTGCTTTGAA ACTTGCCTGG 5640

GGTCTGAGCA TGATGGGAAT AGGGAGACAG GGTAGGAAAG GGCGCCTACT CTTCAGGGTC 5700 

TAAAGATCAA GTGGGCCTTG GATCGCTAAG CTGGCTCTGT TTGATGCTAT TTATGCAAGT 5760 

TAGGGTCTAT GTATTTAGGA TGCGCCTACT CTTCAGGGTC TAAAGATCAA GTGGGCCTTG 5820

GATCGCTAAG CTGGCTCTGT TTGATGCTAT TTATGCAAGT TAGGGTCTAT GTATTTAGGA 5880

TGTCTGCACC TTCTGCAGCC AGTCAGAAGC TGGAGAGGCA ACAGTGGATT GCTGCTTCTT 5940

GGGGAGAAGA GTATGCTTCC TTTTATCCAT GTAATTTAAC TGTAGAACCT GAGCTCTAAG 6000 

TAACCGAAGA ATGTATGCCT CTGTTCTTAT GTGCCACATC CTTGTTTAAA GGCTCTCTGT 6060 

ATGAAGAGAT GGGACCGTCA TCAGCACATT CCCTAGTGAG CCTACTGGCT CCTGGCAGCG 6120

GCTTTTGTGG AAGACTCACT AGCCAGAAGA GAGGAGTGGG ACAGTCCTCT CCACCAAGAT 6180

CTAAATCCAA ACAAAAGCAG GCTAGAGCCA GAAGAGAGGA CAAATCTTTG TTGTTCCTCT 6240

TCTTTACACA TACGCAAACC ACCTGTGACA GCTGGCAATT TTATAAATCA GGTAACTGGA 6300 

AGGAGGTTAA ACTCAGAAAA AAGAAGACCT CAGTCAATTC TCTACTTTTT TTTTTTTTTT 6360

TCCAAATCAG ATAATAGCCC AGCAAATAGT GATAACAAAT AAAACCTTAG CTGTTCATGT 6420

CTTGATTTCA ATAATTAATT CTTAATCATT AAGAGACCAT AATAAATACT CCTTTTCAAG 6480

AGAAAAGCAA AACCATTAGA ATTGTTACTC AGCTCCTTCA AACTCAGGTT TGTAGCATAC 6540

ATGAGTCCAT CCATCAGTCA AAGAATGGTT CCATCTGGAG TCTTAATGTA GAAAGAAAAA 6600 

TGGAGACTTG TAATAATGAG CTAGTTACAA AGTGCTTGTT CATTAAAATA GCACTGAAAA 6660 

TTGAAACATG AATTAACTGA TAATATTCCA ATCATTTGCC ATTTATGACA AAAATGGTTG 6720 

GCACTAACAA AGAACGAGCA CTTCCTTTCA GAGTTTCTGA GATAATGTAC GTGGAACAGT 6780

CTGGGTGGAA TGGGGCTGAA ACCATGTGCA AGTCTGTGTC TTGTCAGTCC AAGAAGTGAC 6840

ACCGAGATGT TAATTTTAGG GACCCGTGCC TTGTTTCCTA GCCCACAAGA ATGCAAACAT 6900 

CAAACAGATA CTCGCTAGCC TCATTTAAAT TGATTAAAGG AGGAGTGCAT CTTTGGCCGA 6960 

CAGTGGTGTA ACTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTCTGTG TGTGGGTGTG 7020

GGTGTATGTG TGTTTTGTGC ATAACTATTT AAGGAAACTG GAATTTTAAA GTTACTTTTA 7080

TACAAACCAA GAATATATGC TACAGATATA AGACAGACAT GGTTTGGTCC TATATTTCTA 7140

GTCATGATGA ATGTATTTTG TATACCATCT TCATATAATA TACTTAAAAA TATTTCTTAA 7200

TTGGGATTTG TAATCGTACC AACTTAATTG ATAAACTTGG CAACTGCTTT TATGTTCTGT 7260 

CTCCTTCCAT AAATTTTTCA AAATACTAAT TCAACAAAGA AAAAGCTCTT TTTTTTCCTA 7320 

AAATAAACTC AAATTTATCC TTGTTTAGAG CAGAGAAAAA TTAAGAAAAA CTTTGAAATG 7380

GTCTCAAAAA ATTGCTAAAT ATTTTCAATG GAAAACTAAA TGTTAGTTTA GCTGATTGTA 7440 

TGGGGTTTTC GAACCTTTCA CTTTTTGTTT GTTTTACCTA TTTCACAACT GTGTAAATTG 7500 

CCAATAATTC CTGTCCATGA AAATGCAAAT TATCCAGTGT AGATATATTT GACCATCACC 7560 

CTATGGATAT TGGCTAGTTT TGCCTTTATT AAGCAAATTC ATTTCAGCCT GAATGTCTGC 7620

CTATATATTC TCTGCTCTTT GTATTCTCCT TTGAACCCGT TAAAACATCC TGTGGCACTC

Seq ID NO: 215 Protein sequence: 
Protein Accession #: NP_002010.1 


1 11 21 31 41 51 
MVSYWDTGVL LCALLSCLLL TGSSSGSKLK DPELSLKGTQ HIMQAGQTLH LQCRGEAAHK 60

WSLPEMVSKE SERLSITKSA CGRNGKQFCS TLTLNTAQAN HTGFYSCKYL AVPTSKKKET 120 

ESAIYIFISD TGRPFVEMYS EIPEIIHMTE GRELVIPCRV TSPNITVTLK KFPLDTLIPD 180

GKRIIWDSRK GFIISNATYK EIGLLTCEAT VNGHLYKTNY LTHRQTNTII DVQISTPRPV 240

KLLRGHTLVL NCTATTPLNT RVQMTWSYPD EKNKRASVRR RIDQSNSHAN IFYSVLTIDK 300

MQNKDKGLYT CRVRSGPSFK SVNTSVHIYD KAFITVKHRK QQVLETVAGK RSYRLSMKVK 360 

AFPSPEVVWL KDGLPATEKS ARYLTRGYSL IIKDVTEEDA GNYTILLSIK QSNVFKNLTA 420

TLIVNVKPQI YEKAVSSFPD PALYPLGSRQ ILTCTAYGIP QPTIKWFWHP CNHNHSEARC 480

DFCSNNEESF ILDADSNMGN RIESITQRMA IIEGKNKMAS TLVVADSRIS GIYICIASNK 540

VGTVGRNISF YITDVPNGFH VNLEKMPTEG EDLKLSCTVN KFLYRDVTWI LLRTVNNRTM 600

HYSISKQKMA ITKEHSITLN LTIMNVSLQD SGTYACRARN VYTGEEILQK KEITIRDQEA 660 

PYLLRNLSDH TVAISSSTTL DCHANGVPEP QITWFKNNHK IQQEPGIILG PGSSTLFIER 720

VTEEDEGVYH CKATNQKGSV ESSAYLTVQG TSDKSNLELI TLTCTCVAAT LFWLLLTLLI 780

RKMKRSSSEI KTDYLSIIMD PDEVPLDEQC ERLPYDASKW EFARERLKLG KSLGRGAFGK 840

VVQASAFGIK KSPTCRTVAV KMLKEGATAS EYKALMTELK ILTHIGHHLN VVNLLGACTK 900

QGGPLMVIVE YCKYGNLSNY LKSKRDLFFL NKDAALHMEP KKEKMEPGLE QGKKPRLDSV 960

TSSESFASSG FQEDKSLSDV EEEEDSDGFY KEPITMEDLI SYSFQVARGM EFLSSRKCIH 1020 

RDLAARNILL SENNVVKICD FGLARDIYKN PDYVRKGDTR LPLKWMAPES IFDKIYSTKS 1080

DVWSYGVLLW EIFSLGGSPY PGVQMDEDFC SRLREGMRMR APEYSTPEIY QIMLDCWHRD 1140
PKERPRFAEL VEKLGDLLQA NVQQDGKDYI PINAILTGNS GFTYSTPAFS EDFFKESISA 1200

PKFNSGSSDD VRYVNAFKFM SLERIKTFEE LLPNATSMFD DYQGDSSTLL ASPMLKRFTW 1260

TDSKPKASLK IDLRVTSKSK ESGLSDVSRP SFCHSSCGHV SEGKRRFTYD HAELERKIAC 1320
CSPPPDYNSV VLYSTPPI 

Seq ID NO: 216 DNA sequence

Nucleic Acid Accession #: NM_024689 

Coding sequence: 7S-624 (underlined sequences correspond to start and stop codons) 



CTCTTTGGCC 
AGCCCAGAGC 
TGGCTGGCCC 
TCCCTTTCTT 
GGTCTTGATA 
GAAATAAGAT 
TCTTATCCTG 
CTGGTCAGCA 



I 



AGTAATCATT 



GACAACTCCA 
TGAGCCCACC 
CTGGACAGGT 
TTACCAAAGG 
CCCTAGTGAT 
CTTTTACTTT 
TTTATGTTTC 
TTTTTTGAGC 
GGGGAAGCTA 
TAGATTTCGG 
GTGTATCAGG 
AGATTTATTT 
TGGTAGGCCT 
ATTCCTTGTT 



CCAAGATGGA 
TGCTGCTGTG 
CTCTGGTGCC 
AATGCAATGC 
CTGACAACTG 
CAAATTACTC 
AATATCAAAA 
GCAGCATTGA 
GCCTCACGCC 
GTATGCTGAG 
GAAAGTGTTA 
AAAGAGGCTT 
AACACACAGG 



21 
I 

" TCTGTACAGC 
GCCCCAGCTG 
GGTCTCAGCC 
CCAAGTCAGA 
CTGCATCGGG 
GCTGGCTTCC 



31 
I 

CTCGAGTGGA 
GGGCCTGAGG 
CTGAGCTGTT 
ACCAGCTACA 
ACATCTATTT 
CACCTTGGAC 
AAAATCTGGC 
GACAGGAAAA 



I 

CAGCCAGAGG 
CTGCCGCCCT 
CTTTCTCCTT 
ATTTTGGAAG 
GCAAGAAGTT 
TGCCTCCCGA 
GCCCTGTGGA 
TCTGTGCCTC 
AGAGGTTCCA 



CGAGATCTCA 
GCGTGTCCTG 

GGACCTGGTG CAGGACTGTC ACCAGGGCCA 
ATAACACCAG 
GCCTTCTCCC 
CTGCATCAAA 



CCAATCCAGA 
CTTCCCTATG 
CCCACATAAC 
GGATGAACTC 
TTAAGTACCT 
CATTCTGGTA 
AAACACTCGG 
CTTCTCCTCT 
TCTTCATTGC 



TCACATCTAT 
GAATACAGCC 
TTTTATCGAC 
AGCATCACCC 
CAATAGAGAC 
AGGATGAGGA 
TTTTTATTTT 
AAGTGATTTA 
ATGAAACCTT 
TAATTGTGGG 
TGTAAGTGGG 
GTCTCTTAAG 
TAGCAGTTAT 



CAAAGAATTG 
GCAATCTGTA 
CCTTAAAAAA 
ATTACCTAGA 
GAAATGCCTT 
TAGCCAAGTT 
CGTCTTCAGA 
TGAACCAAAG 
ATTGACTGGT 
AACCAGGCAC 
TCACTGGGAG 



TCTTTCCACC 
TGTGATCCCC 
AGATAGAACC 
TTCTAAATTT 
TCTTATCTCT 
CCATCAGAGT 
AGAACTCTTT 
GGGTATGGAT 
ATTCAGATTT 
TGTCCATTTT 
CCAGAGAAAC 
ATTTACATGA 
AACCTTTGGG 
ATCTGTTTTT 
TAATCTCTTT 
CACAGCAACA 
GACACATAAA 
CTGTTAAGGA 



AACTGTGTTA 
CCTTCACATG 
ACAGGCTACC 
ACACTGTGGT 
CTGTGATCTA 
TGATAAGGAG 



TGGCATGGAC
TACCAACCAC 
CAGCTCCCAT 
TTTGCACAAT 
CCCCTAGATG 



GGCTTCTAGA 
CATGAAATCA 
AAATGAGGAC 
GAAAGCCAAT 
CACTAAAATC 
TGTAATGAAC 
AGAACCAGTA 
TTGTGGGGAT 
CAGGAGCTGA 
GTTCTTAAGG 
TACTTAAAGT 
CCTAGATTAG 
ATTAACCATC 
TTGTGGTAGA 
AGTAACAATA 



CTAGGGCAAT 
TGAAGGAAAA 
GGGAAAAAAA 
CCTGTCAAGA
ACTGCTGATT 



TTCCAAGATG 
GAGTGTTTTT 
GGAGATACAT 



TGCTGTAGTT 
GCTTTGAATG 
AAACTGATTG 
CATTCAATTG 
ACAGCAACAT 
GAACTGTGAC 
TAGCCAAGCA 



I 

CTGCAGCTGG 
CCGCCCTGGC 
GCCAGCTTCT 
GACTTTCCTC 
CTTTAAAGAA 
TTCCTTGCTT 
GATCTTTAGA 
TGCATCAGCC 
GAAATGGCTG 
GAGAGAACTA 
CCCAGCACTG 
ATTTTCAAAT 
GCCACCCTCC 
ATTCTCTGAT 
GGGCTGTTGC 
GATTCTCAGT 
TCTGACAACA 
CCTGTTCTCG 
AAGCATACCT 
CTATCTATCT 
GCTGGTGATG 
ATGACTCCTT 
AAAGCAAATC 
CCTTTAGCTA 
ATACATGTCC 
AAAATCCATA 
TGCAGATAGA 
ATTGGATCAG 
TAGGTGCTAA 



ATTTTATTGG 
TTTGCACATG 
ACCTTGGGGA 
CCTGTTCCTG 
CTATTGTTGT 



TGGCTGGCTT 
GGGAGAGGGA 
GTGGTAAGTT 
AGCAATTGGT 
AAATGGTTGG 



GGAGCTGCCT 
CTACAGTTAT 
GTATTAAACC 



GGGGCAACAG 
AGCCCTGTTG 
AATCCCCAAA 
TTTCCCAATT 
GCCAAACAGA 
CCATCATAAA 
AGAGGTTGCT
ATTTCAACAC 



CTGTTTGCCA 
AGTAGACAAA 
GGGAGGGGTT 
CAGAACCTTC 



GCCTGCTAAA 
AGCCACTCTC 
AAATTCCAGT 
AAGAAATGCA 
TTTGGTTATC 
CAGATGGGAG 
CTGGTTTGAA 
TTGGCCAATA 
GGTCTCACTC 
AGTAAAAATG 
TCAGCAAATA 



1020 
1080 
1140 
1200 
1260 



1440 
1500 

1560 



2100 
2160 
2220 
2280 



| | | | | |

MEPQLGPEAA ALRPGWLALL LWVSALSCSF SLPASSLSSL VPQVRTSYNF GRTFLGLDKC 
NACIGTSICK KFFKEEIRSD NWLASHLGLP PDSLLSYPAK YSDDSKIWRP VEIFRLVSKY
QNEISDRKIC ASASAPKTCS IERVLRKTER FQKWLQAKRL TPDLVQDCHQ GQRELKFLCM 



rrespond to start and stop codons) 



21 



| | | | | |

GATTAATTAA GTGCTTTAAA CGGTCTTGGT AAATATTCCG CGGGAGCTGG GGAGGACCGT 
TGGGATGGCT GTAGCTTGAG TTGAATTTTA ACTGTCCTCA TTCTGGGTTT TGTCGCTCTG 
CTTTCTGTGC CAAGGTGCTG TGTTACGGGA GAGAGTGACT GGAAAGTAAC AAAGCTGAAT 
CTTTCTCCCT GGAGTAAGGC CGAAGACTGG ATTACTACAC GCCTAGACGT GACACTACAC 
CCATAGATCT CATGCATCAT TAATGCCATA TGACATTGCC ATTTTCTTTC TCAGTTCACG
GACAAAAGTG GTGGGTTTTC ATTGTCTTCA CTGATTGTCA ATGCATTAAT AAAGAAGATG 
Seq ID NO: 219 Protein sequence
Protein Accession #: AF075027
ERKWQCHMAL MMHEIYGCSV TSRRWIQSS ALLQGERFSF VTFQSLSPVT QHLGTESRAT 60
KPRMRTVKIQ LKLQPSQRSS PAPAEYLPRP FKALN 

Seq ID NO: 220 DNA sequence 

Nucleic Acid Accession #: AL133411.8 

Coding sequence: 1-1395 (underlined sequences correspond to start and stop codons)



AACAGGCAAC 
ACATCCAGAA 
AAAAAGGACA 



11 
I 

ACTTCATGAC 
TAATCAAATT 
CTACAGAATG 
TCTATAAGGA 
TGGATGAAGC 



ATCTGGGTGG 
CTACCTTTAA 
TATGCTCCTG 
ACAACACAAA 
AATTTTGCTC 
GAAGAAGAAA 
GTGCCTGCAT 



AATCAGCCAG 
CTCCTCCTCT 
CTGAGTTATA 



CACAAACTCC 
ATGAAGATGT 
CTAATGAGAA 
ATCCAAATGG 
TAAAAAACGA 
CAACTACTAG 
TTTGGACAAT 
AATTATTTCA 
ATCTAGAGGA 



TAAAACACTA 
AAAGAGCTTC 
GGAGAAAAAT 
ACTTAAACAA 
TGGAAACCGT 
CACTCATAAG 
ACTGGGGCCT 
ATTACAATTC
AATACAAAAG 
TGGTGAATCT 



AAAGCAATGG 
CGCACAGCAA 
TTTGCAATGT 
TTTTACAAGA 
CATTCTCAGA 
TGGGAGTTGA 
GTCAGAAGCC 



TAAAACTGTC 
CGAACCCTCT 
GTTAGCTAAA 
CCCAATTCCA 
TCTGAAGATC 
CTTGGCATTC 



ACATTTTTGG 
AAGATAATGA 
TCGGTTACCC 



AAAGTTGTGA 
CATCAGAAGG 
GTACCACTTC 
CGGATATCAT 
GGTGA 



TGTTTCAGAT
TTCAGATATG 
TTCCATAGGC 



AAATATGCAT 
ATCTCCAGTT 
GAAGAAAAGA 
TATTATAAAG 
GAAATATCTG 
AATGCAACTA 
CATAAAAATA 
GCTATAAATG 
GAGTCTGATG 
AAAATAATGC 
TGTAGTGCTA 
TCTGTCAACC 
ACATCCTTTT 
AGAAGATCAG 
TCAGATAATG 



41 

i 

CAACAAAAGC 
AAGAAACTAT 
ATCCATCTGA 
AAAAACCAAA 
AAACTAACAC 
ACAATGAGAA 
CCTCTGGCCT 
TGTTTACCCT 
TATGGATTGG 
CACCAGCATT 
ATGAAGATCA 
ATATAAAACA 
TGAGAGCCAC 
CATATGAAAA 
TTCAAftGATC 
GAACAGCAGT 



CAAAATTGAC 
TATCAGAGTG 
CAAAGGGCTG 
CAACGCCATC 
AGGAACAGAA 
CACATGGACA 
CCTGGCTGGC 
TGAAAATGAA 
AACCAAGCAG 
GCCTAATGTG 
TACTCCCAAT 
ATATGTGTTC 
AACTGACCTG 
ATCCACCATT 
AACCCCAAAC 
GGTCATGGAT 



CCAAGAGTGC 
GCACAAGAAC 
AGATGCATGA 



GTTGATGACC 
ACTGAGGCAT 
CACGATGTCT 



11 



21 



31 



51 



| | | | | |

MGKDFMTKTL KAMATKAKID KWDLIKLKSF RTAKETIIRV NRQPTEWEKN FAMYPSDKGL 
TSRIYKELKQ FYKKKPNNAI KKDMDEAGNR HSQKTNTGTE NQTPHVLTHK WELNNENTWT 
QGGEHHTLGP VRSPSGLLAG LEHAGRKLQF IHGLFTLENE WAQEQSIIQK KYALWIGTKQ 
IWVAOTPGES ISSSPALPNV LPLNEDVNKQ EEKNEDHTPN YAPANEKNGN YYKDIKQYVF 
TTQNPNGTES EISVRATTDL NFALKNDKTV NATTYEKSTI EEETTTSEPS HKNIQRSTPN 
VPAFWTMLAK AINGTAWMD DKDQLFHPIP ESDVNATQGE NQPDLEDLKI KIMLGISLMT 
LLLFWLLAF CSATLYKLRH LSYKSCESQY SVNPELATMS YFHPSEGVSD TSFSKSAESS
TFLGTTSSDM RRSGTRTSES KIMTDIISIG SDNEMHENDE SVTR 



222 DNA 



Coding sequence: 237-2073 (underlined 



sequences correspond to start and stop codons) 



I 

GAAGGGGACA 
AGCATTTGAA 
TGCCCTCGAG 
TTTTAGAAGC 
AATCCCCAAG 
CACTGAACTG 
CTGGTGAAGA 
AATACACTGT 
CCTACTTGAA 
TTTTGAGCAT 
GCGAGACAGG 
GTGAGGTCTT 



GAATTACGAG 
GGCACTGAGG 
TAATATTGAG 
CAGCCTCAGT 
AAATGTGACA 
TTATGGGTGG
CCTCCCAGGG 



21 
I 

CACCTCTGCT
GTGAAACCCC
CCTGAGAGGG
CAGAAGAGAA
TTGTGCCTCA 
TCTACTATTC 
CAAAAACGAG 
ATCAGTTTTG 
TTTCCAATTC 
ACAGTCTGCA 
CCTCGGGAAA 
CACCATTGCA 



ACCCTCCTCT 
TTTTGGGCAG 
AGGCCAACCA 
TGTTTATTGT 
ATCCTTTGAG 
CCGTTGCCAC 
AAAATGCATC 
ATGGGAATAA 
GACCTGCTGG 
GGTGTCTTCA 
GTTGCCTTAA 



GGCTGTGTGA 
ATCAGCAGTA 
ACTCAAACTT 
GATTTATTCT 
TCTTCATGAA 
AAAAAGTCCT 
CTTCCTGGAT 
CACTGACCAA 
AAATGAAATC 
CAATCTCATT 
AGAACTGCCT 



CAAGAGCCCC 60 

TTGAATGGGA 120

AGGTGTTAAA 180

GAAGACATGA 240 

TCCAAAGCTG 300 

CATGAACCAG 360 

ACGGCTGAAG 420 

CCTATCAAAG 480 

ATTACTGACA 540 

TGGTGCTCCT 600 

TGTCAAGAGC 660 

CCCAATGGAC 720 
CTTTTTGCCT GCTTCAGGAA GATGTTACCC TGAACATGAG AGTCAGACTA AATGTAGGCT 780

TTCAAGAAGA CCTCATGAAC ACTTCCTCCG CCCTCTATAG GTCCTACAAG ACCGACTTGG 84 0 

AAACAGCGTT CCGGAAGGGT TACGGAATTT TACCAGGCTT CAAGGGCGTG ACTGTGACAG 900 

GGTTCAAGTC TGGAAGTGTG GTTGTGACAT ATGAAGTCAA GACTACACCA CCATCACTTG 960

AGTTAATACA TAAAGCCAAT GAACAAGTTG TACAGAGCCT CAATCAGACC TACAAAATGG 1020

ACTACAACTC CTTTCAAGCA GTTACTATCA ATGAAAGCAA TTTCTTTGTC ACACCAGAAA 1080

TCATCTTTGA AGGGGACACA GTCAGTCTGG TGTGTGAAAA GGAAGTTTTG TCCTCCAATG 1140 

TGTCTTGGCG CTATGAAGAA CAGCAGTTGG AAATCCAGAA CAGCAGCAGA TTCTCGATTT 1200 

ACACCGCACT TTTCAACAAC ATGACTTCGG TGTCCAAGCT CACCATCCAC AACATCACTC 1260 

CAGGTGATGC AGGTGAATAT GTTTGCAAAC TGATATTAGA CATTTTTGAA TATGAGTGCA 1320

AGAAGAAAAT AGATGTTATG CCCATCCAAA TTTTGGCAAA TGAAGAAATG AAGGTGATGT 1380 

GCGACAACAA TCCTGTATCT TTGAACTGCT GCAGTCAGGG TAATGTTAAT TGGAGCAAAG 1440

TAGAATGGAA GCAGGAAGGA AAAATAAATA TTCCAGGAAC CCCTGAGACA GACATAGATT 1500 

CTAGCTGCAG CAGATACACC CTCAAGGCTG ATGGAACCCA GTGCCCAAGC GGGTCGTCTG 1560 

GAACAACAGT CATCTACACT TGTGAGTTCA TCAGTGCCTA TGGAGCCAGA GGCAGTGCAA 1620

ACATAAAAGT GACATTCATC TCTGTGGCCA ATCTAACAAT AACCCCGGAC CCAATTTCTG 1680

TTTCTGAGGG ACAAAACTTT TCTATAAAAT GCATCAGTGA TGTGAGTAAC TATGATGAGG 1740 

TTTATTGGAA CACTTCTGCT GGAATTAAAA TATACCAAAG ATTTTATACC ACGAGGAGGT 1800 

ATCTTGATGG AGCAGAATCA GTACTGACAG TCAAGACCTC GACCAGGGAG TGGAATGGAA 1860 

CCTATCACTG CATATTTAGA TATAAGAATT CATACAGTAT TGCAACCAAA GACGTCATTG 1920 

TTCACCCGCT GCCTCTAAAG CTGAACATCA TGATTGATCC TTTGGAAGCT ACTGTTTCAT 1980 

GCAGTGGTTC CCATCACATC AAGTGCTGCA TAGAGGAGGA TGGAGACTAC AAAGTTACTT 2040 
TCCATATGGG TTCCTCATCC CTTCCTGCTG TAAAAAAAAA AAAAAAAAAA A 



1 11 21 31 41 51 

| | | | | |

MKSPRRTTLC LMFIVIYSSK AALNWNYEST IHPLSLHEHE PAGEEALRQK RAVATKSPTA 60 

EEYTVNIEIS FENASFLDPI KAYLNSLSFP IHGNNTDQIT DILSINVTTV CRPAGNEIWC 120

SCETGYGWPR ERCLHNLICQ ERDVFLPGHH CSCLKELPPN GPFCLLQEDV TLNMRVRLNV 180

GFQEDLMNTS SALYRSYKTD LETAFRKGYG ILPGFKGVTV TGFKSGSVVV TYEVKTTPPS 240

LELIHKANEQ VVQSLNQTYK MDYNSFQAVT INESNFFVTP EIIFEGDTVS LVCEKEVLSS 300

NVSWRYEEQQ LEIQNSSRFS IYTALFNNMT SVSKLTIHNI TPGDAGEYVC KLILDIFEYE 360 

CKKKIDVMPI QILANEEMKV MCDNNPVSLN CCSQGNVNWS KVEWKQEGKI NIPGTPETDI 420

DSSCSRYTLK ADGTQCPSGS SGTTVIYTCE FISAYGARGS ANIKVTFISV ANLTITPDPI 480 

SVSEGQNFSI KCISDVSNYD EVYWNTSAGI KIYQRFYTTR RYLDGAESVL TVKTSTREWN 540

GTYHCIFRYK NSYSIATKDV IVHPLPLKLN IMIDPLEATV SCSGSHHIKC CIEEDGDYKV 600
TFHMGSSSLP AVKKKKKK 

Seq ID NO: 224 DNA sequence 

Nucleic Acid Accession #: NM_007268 

Coding sequence: 46-1245 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

GGTAGCAGGA GGCTGGAAGA AAGGACAGAA GTAGCTCTGG CTGTCATGGG GATCTTACTG 60

GGCCTGCTAC TCCTGGGGCA CCTAACAGTG GACACTTATG GCCGTCCCAT CCTGGAAGTG 120

CCAGAGAGTG TAACAGGACC TTGGAAAGGG GATGTGAATC TTCCCTGCAC CTATGACCCC 180 

CTGCAAGGCT ACACCCAAGT CTTGGTGAAG TGGCTGGTAC AACGTGGCTC AGACCCTGTC 240

ACCATCTTTC TACGTGACTC TTCTGGAGAC CATATCCAGC AGGCAAAGTA CCAGGGCCGC 300

CTGCATGTGA GCCACAAGGT TCCAGGAGAT GTATCCCTCC AATTGAGCAC CCTGGAGATG 360 

GATGACCGGA GCCACTACAC GTGTGAAGTC ACCTGGCAGA CTCCTGATGG CAACCAAGTC 420

GTGAGAGATA AGATTACTGA GCTCCGTGTC CAGAAACTCT CTGTCTCCAA GCCCACAGTG 480

ACAACTGGCA GCGGTTATGG CTTCACGGTG CCCCAGGGAA TGAGGATTAG CCTTCAATGC 540 

CAGGCTCGGG GTTCTCCTCC CATCAGTTAT ATTTGGTATA AGCAACAGAC TAATAACCAG 600 

GAACCCATCA AAGTAGCAAC CCTAAGTACC TTACTCTTCA AGCCTGCGGT GATAGCCGAC 660 

TCAGGCTCCT ATTTCTGCAC TGCCAAGGGC CAGGTTGGCT CTGAGCAGCA CAGCGACATT 720

GTGAAGTTTG TGGTCAAAGA CTCCTCAAAG CTACTCAAGA CCAAGACTGA GGCACCTACA 780 

ACCATGACAT ACCCCTTGAA AGCAACATCT ACAGTGAAGC AGTCCTGGGA CTGGACCACT 840 

GACATGGATG GCTACCTTGG AGAGACCAGT GCTGGGCCAG GAAAGAGCCT GCCTGTCTTT 900 

GCCATCATCC TCATCATCTC CTTGTGCTGT ATGGTGGTTT TTACCATGGC CTATATCATG 960 

CTCTGTCGGA AGACATCCCA ACAAGAGCAT GTCTACGAAG CAGCCAGGGC ACATGCCAGA 1020 

GAGGCCAACG ACTCTGGAGA AACCATGAGG GTGGCCATCT TCGCAAGTGG CTGCTCCAGT 1080

GATGAGCCAA CTTCCCAGAA TCTGGGCAAC AACTACTCTG ATGAGCCCTG CATAGGACAG 1140

GAGTACCAGA TCATCGCCCA GATCAATGGC AACTACGCCC GCCTGCTGGA CACAGTTCCT 1200 

CTGGATTATG AGTTTCTGGC CACTGAGGGC AAAAGTGTCT GTTAAAAATG CCCCATTAGG 1260 

CCAGGATCTG CTGACATAAT TGCCTAGTCA GTCCTTGCCT TCTGCATGGC CTTCTTCCCT 1320 

GCTACCTCTC TTCCTGGATA GCCCAAAGTG TCCGCCTACC AACACTGGAG CCGCTGGGAG 1380 

TCACTGGCTT TGCCCTGGAA TTTGCCAGAT GCATCTCAAG TAAGCCAGCT GCTGGATTTG 1440 

GCTCTGGGCC CTTCTAGTAT CTCTGCCGGG GGCTTCTGGT ACTCCTCTCT AAATACCAGA 1500 

GGGAAGATGC CCATAGCACT AGGACTTGGT CATCATGCCT ACAGACACTA TTCAACTTTG 1560 

GCATCTTGCC ACCAGAAGAC CCGAGGGAGG CTCAGCTCTG CCAGCTCAGA GGACCAGCTA 1620 

TATCCAGGAT CATTTCTCTT TCTTCAGGGC CAGACAGCTT TTAATTGAAA TTGTTATTTC 1680

ACAGGCCAGG GTTCAGTTCT GCTCCTCCAC TATAAGTCTA ATGTTCTGAC TCTCTCCTGG 1740 






TGCTCAATAA ATATCTAATC ATAACAGCAA AAAAAAAAAA AAAAAAA 



Protein Accession #: NP_009199.3

1 11 21 31 41 51

| | | | | |

MGILLGLLLL GHLTVDTYGR PILEVPESVT GPWKGDVNLP CTYDPLQGYT QVLVKWLVQR 

GSDPVTIFLR DSSGDHIQQA KYQGRLHVSH KVPGDVSLQL STLEMDDRSH YTCEVTWQTP 

DGNQVVRDKI TELRVQKLSV SKPTVTTGSG YGFTVPQGMR ISLQCQARGS PPISYIWYKQ

QTNNQEPIKV ATLSTLLFKP AVIADSGSYF CTAKGQVGSE QHSDIVKFVV KDSSKLLKTK

TEAPTTMTYP LKATSTVKQS WDWTTDMDGY LGETSAGPGK SLPVFAIILI ISLCCMVVFT

MAYIMLCRKT SQQEHVYEAA RAHAREANDS GETMRVAIFA SGCSSDEPTS QNLGNNYSDE 
PCIGQEYQII AQINGNYARL LDTVPLDYEF LATEGKSVC 



1-2079 (underlined sequences correspond to s 



21 



ATGGTCGCCA
CCAGGACCAC 
GAGCCTGGTA 
GGGGTTGGGA 
ATCCGTCAGG 
CCATACGAAG 
CTCACAAATG 
CATAATCAAT 
TTCCCCTCCA 
CCTGAAGTTA 
TCCCTTTGTG 
GAAAGCAGAA 
CGCAAGTCAC 
GTTCACACCA 
CAGCCAATAC 
GGTGATCTTG 
CGGAGGAGCC 
CTCAAGAGGT 
CCCACATTCA 
TCCCTGCTCC 
GGGGGGCGCC 
AGGCCAGGGA 
TCCTCTTTAG 
TGCTTAGAAG 
GGTCCAGACA 
CAGCACCAGA 
AATGAATATG 
TATTCCCCAA 
GTCTCCCTTT 
GTCGTAGCAG 
CAGAAGCAGC 



GTTCCGATCA 
GCCTCCGCTC 
CTTACAGGGA 
TCACGGGAAA 
AGGATGCCTT 
CTACCTTGCA 
GCTACCTGCC 
ATCCTAATGG 



I 

AGACAGAGCC 
TGCCCAGAGG 
GGGTGGTGGA 
CACAGTTCAA 
TGATAACAAA 



31 



AACTAAAAAT 
GAGACCTTTT 
AAGAAAAGAG 



TATCAGAAAA 
TATCTTCTGT 
TGTGGTCCAA 
ATCACTGTTC 
CTTTCATGGT 
AGGGAACTGC 
TTGGTCATTG 
GCTGGGTGTG 
AGGAACCTCT 
CCACCTCTCA 
ACTATGCAGG 
GCAGGACTGG 
GATATCTCCT 
ATTGCAAGCT 
CCCACATCCT 
CCAAGTTCCT 
TACGTAGGAG 



CCCCCGGCCG 
TGCTCCTGCT 
TTCATTCCAC 



ATCCCTGGGT 



ATCAATCAGC 
AAATTCTAAA 
TTCGGTACCA 
AACCAAAACT 
AAATGAAGTA 
GAAAAAACCC 
CAAATTAGAA 
ACCAAGGGAA 
TCCAACAACA 
GGTGACGGTC 
CAGCTGCCTG 
TTCTTCCTTG 
CCAGATGGGC 
GGAAGGAACA 
GCAGCATCAG 
GAGACTCAGT 
GGGCTGCTGG 
GCGCCGCCAT 
AAGCCTTAGA 
GCAAGGCCTC 
AGAGACGAGA 
TCAGTCTGAA 
CAAACGCAAA 
ATCCAATATG 
CAAGTACACG 
TCAGAAGAGG 
GGTGGCTGAC 
GTATCTGACT 
GCCCTTCGGC 



I 

CCGTATCTTC
CCAAAAGCAG 
GCCATCGTCC 
CAACCACCTC 
ATTGACATTG 
CAATACTCAC 
ATGTATGAAA 
CAGAAGACCA 
CAAACTGTGA 
ATCCAGAATG
CAGGCAAGTG 
AAAAAGCATG 
CCAGAGGAAC 
GATCCAGTAC 
GAAGTGTCCA 
ACACCCTGTT 
GAGATCTTGG 
AAGTTCCTCA 
TGGTCACCTA 
GACCAGATGT 



I 

CAGGGACACT 
CCCAACAAGA 
TCACGTATGC 
AACTCACTGA 
CTGAAGATGG 
CTACAACAGA 
TTCAAACCAA 
CATTAAATTC 
TTCCAAAGAA 
GCAGGGAATT 
AGCACACGAA 
ACTCATCAAG 



TAAAAGAGGA 
CTGGTGTTAA 
GGGTGCCCCG 



TTCCTACGAT 
TGTTCGCCCC 
TTGACACTCA 
GCTGTCGGCA 
CTAGGTGGGT 



CCTCCACGGG 
TGGCCTCCAC 
CATCCAGGGG 
TCCGCATCTC 
GTGAAGTGGA 
CAGACCACGT 
GAGCCCAGGA 



AGTGCCCCTA 
GCAAACAGCC 
CCTGGCACAA 
CCTGCCTGCC 
GGCCTGTCAG 
AGACGCCAGC 
GGAGTTTCTG 
TCCTCCTAG 



CCTCAACTCC 
ACCACTACTT 
ATTTCCTGCA 
GGGGGTGGGG 
ACGCACAATG 
GAAGATGTAG 
AGCTGCCGGG 
GATTAAAGGC 



CCCGGAATTT 
CATCTGCCAC 
GAGAAGAATC 
CTGTGAGAAA 
AGCCTTTCTT 
CAGGAACAGC 
AAGGAATGCC 
AAGAATCCCG 
TCCCTACCAC 



TGGCCACAAA 
GGAGACATTC
GGGCCAGCAG 



: and stop codons) 



AGACAAGATG 
GCCCGGCATT 
GCTGGGGATC 
CTCCGCCAGC 



TCTTCCTCCA 
ATACCAGTCG 
TAGAAAACCC 
GAGTGGCTCA 
GTTCAAGTCT 
GTCAAAGCAT 
ATCTGAAGAG 
AAATGAGAGG 
AGCCCCAGTT 



1500 
1560 
1620 
1680 
1740
1800 
1860 
1920 



Protein Accession #: XP_064321.1



11 



21 



31 



MVASSDQDRA 
GVGITGNTVQ 
LTNGYLPSIS 
PEVKLKITKT 
RKSHKIPKLE 
GDLVWSKVTV 
PTFKGTAQMG 
RPGKEPLRLS 



I I I 

PYLPGTDDKM PGPRLRSAQR PKAAQQEPGI 
QPPQLTDSAS IRQEDAFDNK IDIAEDGGQT 
MYEIQTKYQS HNQYPNGNSK QKTTLNSRKP 
IQMGRELFKS SLCGDLLNEV CASEHTKSKH 
PEEQNRPNER VHTISEKPRE DPVLKEEAPV 
TPCWVPRLRG RRSHHCSSCL EILVLVPALS 
WSPMASTTNV SLLLGHWEGT DQMSSRGPEF 



I I 
EPGTYREGGG AIVLTYALGI 
PYEATLQQSF QYSPTTDLPP 
FPSTATTSVP QTVIPKKSGS 



YSPTHILQSE 
QKQPCPAKYT 
CSCSQDVYLT 



AVGKRYCRNS QHQRYLLQGL LGGFLEERNA 
SAPNHYFPYH VSLSKFLKRK ANSHFLHLCA 
PACHAQWETF RKFHVMAQKR GLSGRCRGQQ 
GVSGLKASRG FIPHPWVPFG SS 



QPILSSVPTT EVSTGVKFQV 
LKRSFMVSSL KFLTSTGKQK 
GGRRWVWQHQ KPQIRISICH 
CLEDYAGRRH LTLRAQEAFL 
NEYDCKLETR EAASSTPRIP 
WAVRRRSNM PGTRGWGGHK 
PPAAPRKVAD RRQQLPGAPG 
Seq ID NO: 228 DNA sequence
Nucleic Acid Accession #: NM_006033
Coding sequence: 253-1 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

AGCAGCGAGT CCTTGCCTCC CGGCGGCTCA GGACGAGGGC AGATCTCGTT CTGGGGCAAG 60

CCGTTGACAC TCGCTCCCTG CCACCGCCCG GGCTCCGTGC CGCCAAGTTT TCATTTTCCA 120 

CCTTCTCTGC CTCCAGTCCC CCAGCCCCTG GCCGAGAGAA GGGTCTTACC GGCCGGGATT 180

GCTGGAAACA CCAAGAGGTG GTTTTTGTTT TTTAAAACTT CTGTTTCTTG GGAGGGGGTG 240

TGGCGGGGCA GGATGAGCAA CTCCGTTCCT CTGCTCTGTT TCTGGAGCCT CTGCTATTGC 300 

TTTGCTGCGG GGAGCCCCGT ACCTTTTGGT CCAGAGGGAC GGCTGGAAGA TAAGCTCCAC 360

AAACCCAAAG CTACACAGAC TGAGGTCAAA CCATCTGTGA GGTTTAACCT CCGCACCTCC 420

AAGGACCCAG AGCATGAAGG ATGCTACCTC TCCGTCGGCC ACAGCCAGCC CTTAGAAGAC 480 

TGCAGTTTCA ACATGACAGC TAAAACCTTT TTCATCATTC ACGGATGGAC GATGAGCGGT 540

ATCTTTGAAA ACTGGCTGCA CAAACTCGTG TCAGCCCTGC ACACAAGAGA GAAAGACGCC 600 

AATGTAGTTG TGGTTGACTG GCTCCCCCTG GCCCACCAGC TTTACACGGA TGCGGTCAAT 660 

AATACCAGGG TGGTGGGACA CAGCATTGCC AGGATGCTCG ACTGGCTGCA GGAGAAGGAC 720 

GATTTTTCTC TCGGGAATGT CCACTTGATC GGCTACAGCC TCGGAGCGCA CGTGGCCGGG 780 

TATGCAGGCA ACTTCGTGAA AGGAACGGTG GGCCGAATCA CAGGTTTGGA TCCTGCCGGG 840 

CCCATGTTTG AAGGGGCCGA CATCCACAAG AGGCTCTCTC CGGACGATGC AGATTTTGTG 900 

GATGTCCTCC ACACCTACAC GCGTTCCTTC GGCTTGAGCA TTGGTATTCA GATGCCTGTG 960

GGCCACATTG ACATCTACCC CAATGGGGGT GACTTCCAGC CAGGCTGTGG ACTCAACGAT 1020

GTCTTGGGAT CAATTGCATA TGGAACAATC ACAGAGGTGG TAAAATGTGA GCATGAGCGA 1080

GCCGTCCACC TCTTTGTTGA CTCTCTGGTG AATCAGGACA AGCCGAGTTT TGCCTTCCAG 1140

TGCACTGACT CCAATCGCTT CAAAAAGGGG ATCTGTCTGA GCTGCCGCAA GAACCGTTGT 1200

AATAGCATTG GCTACAATGC CAAGAAAATG AGGAACAAGA GGAACAGCAA AATGTACCTA 1260 

AAAACCCGGG CAGGCATGCC TTTCAGAGTT TACCATTATC AGATGAAAAT CCATGTCTTC 1320

AGTTACAAGA ACATGGGAGA AATTGAGCCC ACCTTTTACG TCACCCTTTA TGGCACTAAT 1380

GCAGATTCCC AGACTCTGCC ACTGGAAATA GTGGAGCGGA TCGAGCAGAA TGCCACCAAC 1440

ACCTTCCTGG TCTACACCGA GGAGGACTTG GGAGACCTCT TGAAGATCCA GCTCACCTGG 1500

GAGGGGGCCT CTCAGTCTTG GTACAACCTG TGGAAGGAGT TTCGCAGCTA CCTGTCTCAA 1560 

CCCCGCAACC CCGGACGGGA GCTGAATATC AGGCGCATCC GGGTGAAGTC TGGGGAAACC 1620 

CAGCGGAAAC TGACATTTTG TACAGAAGAC CCTGAGAACA CCAGCATATC CCCAGGCCGG 1680 

GAGCTCTGGT TTCGCAAGTG TCGGGATGGC TGGAGGATGA AAAACGAAAC CAGTCCCACT 1740 

GTGGAGCTTC CCTGAGGGTG CCCGGGCAAG TCTTGCCAGC AAGGCAGCAA GACTTCCTGC 1800

TATCCAAGCC CATGGAGGAA AGTTACTGCT GAGGACCCAC CCAATGGAAG GATTCTTCTC 1860 

AGCCTTGACC CTGGAGCACT GGGAACAACT GGTCTCCTGT GATGGCTGGG ACTCCTCGCG 1920

GGAGGGGACT GCGCTGCTAT AGCTCTTGCT GCCTCTCTTG AATAGCTCTA ACTCCAAACC 1980

TCTGTCCACA CCTCCAGAGC ACCAAGTCCA GATTTGTGTG TAAGCAGCTG GGTGCCTGGG 2040

GCCTCTCGTG CACACTGGAT TGGTTTCTCA GTTGCTGGGC GAGCCTGTAC TCTGCCTGAC 2100 

GAGGAACGCT GGCTCCGAAG AGGCCCTGTG TAGAAGGCTG TCAGCTGCTC AGCCTGCTTT 2160 

GAGCCTCAGT GAGAAGTCCT TCCGACAGGA GCTGACTCAT GTCAGGATGG CAGGCCTGGT 2220 

ATCTTGCTCG GGCCCTAGCT GTTGGGGTTC TCATGGGTTG CACTGACCAT ACTGCTTACG 2280

TCTTAGCCAT TCCGTCCTGC TCCCCAGCTC ACTCTCTGAA GCACACATCA TTGGCTTTCC 2340 

TATTTTTCTG TTCATTTTTT AATTGAGCAA ATGTCTATTG AACACTTAAA ATTAATTAGA 2400

ATGTGGTAAT GGACATATTA CTGAGCCTCT CCATTTGGAA CCCAGTGGAG TTGGGATTTC 2460 

TAGACCCTCT TTCTGTTTGG ATGGTGTATG TGTATATGCA TGGGGAAAGG CACCTGGGGC 2520 

CTGGGGGAGG CTATAGGATA TAAGCATTAG GGACCCTGAG GCTTTAAGTG GTTTCTATTT 2580

CTTCTTAGTT ATTATGTGCC ACCTTCTTAG TTATTATGTG CCACCTCCCC TATGAGTGAC 2640 

GTGTTTGATC ACTAGCAGAA TAGCAAGCAG AGTATCATTC ATGCTGGGGC CAGAATGATG 2700 

GCCGGTTGCC AGATATAACT GCTTTGGAGC AAATCTCTTC TGTTTAGAGA GATAGAAGTT 2760 

ATGACATATG TAATACACAT CTGTGTACAC AGAAACCGGC ACCTGCCAGA CAGAGCTGGT 2820 

TCTAAGATTT AATACAGTGC TTTTTTTCCT CTTTGAAATA TTTTACTTTA ATACCAGTGC 2880

CTTTTCTTGT TGAACTTCTT GGAAAAGCCA CCAATTCTAG ATCTTGATTT GAATTAATAC 2940

ACACAATATC TGAGACACTT ACACTTTTCA AAAGATTTGT GTATGCATTG CCTAATTAGA 3000

GTAGGGGGAG AAGGGCAACT ATTATTATCC CTATTTTACA AAACTGAGGC TTAGTGAGGT 3060

TCAGCCACAT GCCTAGACTT ATATACTAGT TAGTGGTGCA GCCAGGGAGA GGACTCAGAT 3120 

TTCCTGGAGG CAAAGTCTAT CTCTGAAACT CCATGAAGAC TTTTGCAGCC AGTTCCCACC 3180 

AATATGCCCC AGACGTGAGA CAAACAAGGA CTTTTTTTTT TATATAGAGC CATCCATAAA 3240 

ATCCTAAGCC CTTTTATTAA TGTATAACCA GGAGAACATC TGTGCCAACG GTTGGACTTT 3300

TTATGGCTGA GATTCGGGAG GAAGTGTGAC ACCAAGCAGG AGAGGAAGAA TGATTTTCTT 3360 

TGTACTTAGG TTTTCTAAGG ACATTGTTTT AATCTGTATC GTGCCAAAGT TGTATCACTG 3420 

TTAAACTTCT GAAGACATAA CCAGTTGAGT CTTATTTCAA GATATGTTCT CAAGCCAATT 3480

GTGTGCTTCT CTTGTTTCTG TGATTGCTTT CTAGCCAAAG CCAAGCTTGT ACAGGTTGAG 3540

TATCCCTTAT CCAAAATGCT TGGAACCAGA AGTGTTTCAA ATTTTAGATT ATTTTCAGAT 3600 

TTTGGAATGT TTGCATATAC ATAATGAGAT ATTTTGGGAA TAGGACCCGA GCCTAAACAC 3660

AAAATTCATT GATGTGTCAG TTACACCTTA TCCACATAGC CTGAGGGTAA TTTTATACGA 3720

TATTTTAAAT AGTTGTGTAC ATGAAGCATG GTTTGTGGTA ACTTATGTGA GGGGTTTTCC 3780

CATTTTTTGT CTTGTTGGTG CTCAAAAAGT TTTGGATTTT GGAGCATTTC GGATTTTGGA 3840 

TTTTTGGATT AGGGTTGCTC AACCCATATT ATTGGCTGTA CATCCTGGTC ACTTCTGACT 3900 
TCTGTTTTTA CTAATGGAAG CTTTGCA 

Seq ID NO: 229 Protein sequence: 
Protein Accession #: NP_006024.1

1 11 21 31 41 51

| | | | | |

MSNSVPLLCF WSLCYCFAAG SPVPFGPEGR LEDKLHKPKA TQTEVKPSVR FNLRTSKDPE 60

HEGCYLSVGH SQPLEDCSFN MTAKTFFIIH GWTMSGIFEN WLHKLVSALH TREKDANVVV 120

VDWLPLAHQL YTDAVNNTRV VGHSIARMLD WLQEKDDFSL GNVHLIGYSL GAHVAGYAGN 180

FVKGTVGRIT GLDPAGPMFE GADIHKRLSP DDADFVDVLH TYTRSFGLSI GIQMPVGHID 240

IYPNGGDFQP GCGLNDVLGS IAYGTITEVV KCEHERAVHL FVDSLVNQDK PSFAFQCTDS 300

NRFKKGICLS CRKNRCNSIG YNAKKMRNKR NSKMYLKTRA GMPFRVYHYQ MKIHVFSYKN 360

MGEIEPTFYV TLYGTNADSQ TLPLEIVERI EQNATNTFLV YTEEDLGDLL KIQLTWEGAS 420

QSWYNLWKEF RSYLSQPRNP GRELNIRRIR VKSGETQRKL TFCTEDPENT SISPGRELWF 480
RKCRDGWRMK NETSPTVELP 
It is understood that the examples described above in no way serve to limit the
true scope of this invention, but rather are presented for illustrative purposes. All
publications, sequences of accession numbers, and patent applications cited in this
specification are herein incorporated by reference as if each individual publication or patent
application were specifically and individually indicated to be incorporated by reference.
WHAT IS CLAIMED IS:

1. A method of detecting an angiogenesis-associated transcript in a cell in a patient, the method comprising contacting a biological sample from the patient with a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-8.

2. The method of claim 1, wherein the biological sample is a tissue sample.

3. The method of claim 1, wherein the biological sample comprises isolated nucleic acids.

4. The method of claim 3, wherein the nucleic acids are mRNA.

5. The method of claim 3, further comprising the step of amplifying nucleic acids before the step of contacting the biological sample with the polynucleotide.

6. The method of claim 1, wherein the polynucleotide comprises a sequence as shown in Tables 1-8.

7. The method of claim 1, wherein the polynucleotide is labeled.

8. The method of claim 7, wherein the label is a fluorescent label.

9. The method of claim 1, wherein the polynucleotide is immobilized on a solid surface.

10. The method of claim 1, wherein the patient is undergoing a therapeutic regimen to treat a disease associated with angiogenesis.

11. The method of claim 1, wherein the patient is suspected of having cancer.

12. An isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Tables 1-8.

13. The nucleic acid molecule of claim 12, which is labeled.

14. The nucleic acid of claim 13, wherein the label is a fluorescent label.
15. An expression vector comprising the nucleic acid of claim 12.

16. A host cell comprising the expression vector of claim 15.

17. An isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-8.

18. An antibody that specifically binds a polypeptide of claim 17.

19. The antibody of claim 18, further conjugated or fused to an effector component.

20. The antibody of claim 19, wherein the effector component is a fluorescent label.

21. The antibody of claim 19, wherein the effector component is a radioisotope.

22. The antibody of claim 19, which is an antibody fragment.

23. The antibody of claim 19, which is a humanized antibody.

24. A method of detecting a cell undergoing angiogenesis in a biological sample from a patient, the method comprising contacting the biological sample with an antibody of claim 18.

25. The method of claim 24, wherein the antibody is further conjugated or fused to an effector component.

26. The method of claim 25, wherein the effector component is a fluorescent label.

27. A method of detecting antibodies specific to angiogenesis in a patient, the method comprising contacting a biological sample from the patient with a polypeptide which is encoded by a nucleotide sequence of Tables 1-8.



